







Statistical Adjustment
of Data


BY 


W. EDWARDS DEMING, Ph.D.

Bureau of the Census and
Bureau of the Budget

WASHINGTON 


First Edition

Fourth Printing 


New York 

JOHN WILEY & SONS, INC.

CHAPMAN & HALL, LTD.
London 



Copyright, 1938, 1943

BY 

W. EDWARDS DEMING 

All Rights Reserved 

This book or any part thereof must not 
be reproduced in any form without 
the written permission of the publisher. 


FOURTH PRINTING, SEPTEMBER, 1948 


PRINTED IN THE UNITED STATES OF AMERICA 




PREFACE 


The central thought in writing this book has been the adjustment
of data, with emphasis on scattered portions of the topic that
are difficult to find elsewhere, and which in my opinion are destined 
to assume increased importance in the future. Some of the topics 
that in the past have been thought to be important in statistics 
and least squares are conspicuously absent here, or receive only 
scant mention. It must be confessed that this circumstance 
arises partly by choice. 

The intention has been to produce a book for reference, and 
also for a text. Some differential calculus is used in the development
of the general theory in Chapter IV, but it is not necessary
to be able to follow this development in order to apply the recommended
procedures, or to interpret the results of the calculations.
The main prerequisite is knowledge and experience in the subject 
matter. 

The reader must not expect to find in this book an account of 
statistical methods for all occasions. It supplements: it does not 
supplant. There has not been in my mind any hope of covering 
the entire field of least squares. For instance, recent contributions
from Hotelling, Wald, and Churchill Eisenhart have regretfully
been omitted. An attempt to include them would have
meant an unpredictable delay in publication. 

Possibly the reader will see here the interpretation of adjusted
values in a new light, owing to my appreciation of the powerful
stimulus of Shewhart’s contributions to statistical procedures and
the philosophy of science. The student is first introduced to some
basic statistical concepts, and in particular he is asked to view a
method of adjustment as a way of arriving at a figure that can be
used for a given purpose, in other words, for action. An abundance
of procedures and skeleton table forms for numerical calculation
is provided for immediate adaptation to many kinds of problems
met in practice. It can be said that all of the recommended
procedures have been tested in use, many of them in mass production.
For the first time, a method for adjusting the observations
(finding the calculated points corresponding to the observed
points) is provided for the circumstance in which both the x and
y coordinates are subject to error. The insidious phenomenon
of the instability of equations is introduced, even though inadequately,
and the reader can at least claim acquaintance with it.

The successful introduction of sampling into the 1940 Census of 
Population, aside from being a manifestation of wisdom and foresight
on the part of Dr. Philip M. Hauser, Assistant Director of
the Census, and Dr. Leon E. Truesdell, Chief Statistician for 
Population, brought with it a host of unsolved statistical problems. 
One of these was the adjustment of sample frequencies to known 
marginal totals, solutions to which are given in Chapter VII. 
With the subsequent rapid growth of sampling in the conduct of 
many social and economic surveys of local and national scope, the 
inclusion of such methods may turn out to be timely. 

Different kinds of problems of adjustment (e.g., geodesy on the 
one hand and curve fitting on the other) are here unified and 
brought under one general principle and one solution. The distinctions
between different kinds of problems are left where they
belong, namely, in the conditions that the adjusted values are
subjected to (Ch. IV). Unfortunately and inadvertently, intellectual
gulfs have grown up between writers in statistics, least squares,
and curve fitting. Each of the three groups has gone its own
way, rediscovering developments long since discovered by the
others, or, what is worse, not rediscovering them. Here the
reader will find contributions from all three groups, and he will
perceive that they are complementary.

The methods of this book were developed over a period of 
sixteen years in the government service, during which I have had 
the pleasure of assisting colleagues in many branches of science. 
The manuscript originated in notes kept during my statistical 
practice, and to meet the need of text material for classes taught 
in the Graduate School of the Department of Agriculture. A 
mimeographed edition of portions of this book appeared in 1938
under the title Least Squares as a publication of the Graduate
School. Many of the calculations and procedures were worked 
out by my wife, Lola S. Deming. A number of helpful comments 
came from Professor W. G. Cochran, who kindly read the galley 
proof. Extensive contributions in the text, and help in reading 
proof, have come from several of my colleagues and assistants in 
the Census, notably Mr. Samuel W. Greenhouse, now with the 
armed forces, and Mr. Jacob E. Lieberman. 

W.E.D. 

Washington 
August 1943 




CONTENTS 

PART A: SOME SIMPLE ADJUSTMENTS

Chapter I. On the Meaning of Adjustment (p. 1)

1. Some remarks on the problem of adjustment. 2. Randomness
and the importance of order. 3. Performing a simple
adjustment. 4. Least squares adjustments often easy.
5. Statistical methods and correction for biases. 6. Repeated
experimental results necessary for establishing a scientific
law. 7. The nature of an adjustment.

Chapter II. Simple Illustrations of Curve Fitting (p. 14)

8. The principle of least squares. 9. The simplest example of
curve fitting: the single sample. 10. The same problem
with unequal weights. 11. A digression to define weights.
12. A more complicated problem: several samples. 13. The
estimates of $\sigma$, internal and external. 14. Comparison of the
two estimates: analysis of variance. 15. Another simple
problem: the slope of a line that is known to pass through the
origin. 16. The t test for the slope. 17. The x coordinates
subject to error, y free of error.

PART B: THE LEAST SQUARES SOLUTION OF MORE COMPLICATED
PROBLEMS

Chapter III. The Propagation of Error (p. 37)

18. Small errors in functions of one variable. 19. Small
errors in functions of several variables. 20. The propagation
of mean square error or variance. 21. The standard error of a
mean. 22. A numerical example of small errors.

Chapter IV. The General Problem in Least Squares (p. 49)

23. Outline of the problem. 24. The conditions. 25. Notation
for the derivatives. 26. The reduced conditions. 27. The
method of Lagrange multipliers. 28. The general normal
equations. 29. Short expression for S. The normal equations
are really normal. The matrix of the coefficients is positive
definite.

PART C: CONDITIONS WITHOUT PARAMETERS

Chapter V. Geometric Conditions (p. 59)

30. Adaptation of the general solution to conditions without
parameters. 31. Example: the plane triangle. 32. The
plane triangle continued. The weights of the adjusted angles,
and any function of them.

Chapter VI. Systematic Computation for Geometric Conditions (p. 70)

33. Steps in the formation of the normal equations. 34. Numerical
example: a surveying problem. 35. Conclusions from
the solution of the normal equations. 36. Shorter method of
computing the weights of a large number of functions.

Chapter VII. Adjusting Sample Frequencies to Expected Marginal
Totals (p. 96)

37. Statement of the problem. 38. Cell frequencies and
sampling errors. 39. Nature of the adjustment. 40. A
closer look at the problem. 41. The least squares requirement.
42. The two-dimensional problem. 43. A numerical
example of the two-dimensional Case II. 44. The three-dimensional
problem. 45. A simplified procedure: iterative
proportions. 46. Iterative proportions in three dimensions.
47. Simplification when only one cell requires adjustment.
48. The Stephan method. 49. The Bruyere method. 50. Some
remarks on the accuracy of an adjustment.

PART D: CONDITIONS CONTAINING PARAMETERS

Chapter VIII. Curve Fitting in More Complicated Circumstances (p. 128)

51. Some general remarks on the purpose of curve fitting.
52. Graphical considerations. 53. The conditions. 54. The
L coefficients. 55. The normal equations for curve fitting.
56. Adjusting the observations, or finding the calculated
points. 57. The distribution of $\chi^2$. 58. Some geometry
concerning the adjustment of observations.

Chapter IX. Systematic Computation for Fitting Curves by Least
Squares (p. 148)

59. Preliminary note on the tabular solution. 60. Systematic
procedure for forming the normal equations for the parameters.
61. Systematic solution of the normal equations. The reciprocal
matrix. Systematic computation of S. 62. The
weights of the parameters; their standard errors. The standard
error of a function of the parameters. The standard error
of a curve. 63. The error bands associated with a curve.
Rejection of observations.

PART E: EXERCISES AND NOTES

Chapter X. Exercises on Fitting Various Functions (p. 172)

64. Purpose of the chapter. 65. The line. 66. The parabola.
67. The exponential and its logarithmic form. 68. The exponential
with a linear component. 69. The generalized
hyperbola and its logarithmic form. 70. The hyperbola with
a linear component. 71. Miscellaneous.

Chapter XI. Four Examples in Curve Fitting (p. 212)

Example 1. Fitting an Isotherm
72. Formation and solution of the normal equations. 73. A
note on instability.

Example 2. Another Polynomial
74. The observations and their weights. 75. A note on the
observed values. 76. Formation and solution of the normal
equations. 77. The reciprocal solution. 78. Adjusting the
observations. 79. The standard error of the calculated ordinates.
80. Calculation of the external estimate of $\sigma$.

Example 3. A Formula Useful in Forestry
81. The formula to be fitted. 82. Rewriting the function to
gain an advantage. 83. Formation and solution of the normal
equations. 84. Numerical results. 85. Comments from
Professor Francis X. Schumacher, Duke University.

Example 4. A Sample Survey of Canned Goods
86. Object of the survey. 87. What the sample gives.
88. The estimated inventory and its standard error. 89. Summary
of the errors to be considered; effect on sample designs.

Appendix: Tables for Making Random Observations for Class
Illustration (p. 252)

Part A: Normal Deviates Directly in Units of the Standard Error (p. 252)

Part B: Normal Distribution of the Numbers from 0000 to 9999.
Class Interval $0.2\sigma$ (p. 255)

Index (p. 257)







Part A 

SOME SIMPLE ADJUSTMENTS 

CHAPTER I 

ON THE MEANING OF ADJUSTMENT 

1. Some remarks on the problem of adjustment. Before learning
how to use least squares, or any other method of adjustment,
one might rightfully ask what is accomplished by procedures of
adjustment, and what is the purpose of using them?

In the first place it must be recognized that any measurement is
the result of doing something — applying some operation. Some 
procedure is carried out, and some number is written down as a 
result. In the second place it must be understood that the purpose 
of taking the measurement is to use it for doing something. The 
object of taking data is to provide a basis for action. 

If you were to measure a table with the idea of ordering a plate 
glass top for it, you would use a rule, tape, or yardstick, and 
measure it. The procedure of laying down the rule, counting the 
number of feet, estimating the number of inches and fractions of 
the last foot, and recording the figure, constitutes the operation of 
measurement. The action, in this case, consists of ordering a plate 
glass of a certain size. The measurement provides a basis for the 
action. If the measurement is wrong by so great an amount that 
the glass is unfit for the purpose intended when it arrives, then the 
figure has led us to the wrong action. 

You might repeat the operation of measurement, especially if
the length is required to the nearest sixteenth of an inch. Whatever
the exactness required, the problem is fundamentally the
same. One takes a measurement (that is, one carries out an
operation) and thus gets a certain result (a number), and writes
it down. Why should he repeat the operation? The answer may
be contained in one or both of two statements: (a) to get a better
value for the purpose intended, by adjusting the observations;
(b) to gain some assurance that he is following the procedure
intended. The latter is often more important, though also more
difficult. Methods of adjustment assist us in both questions.

As has been said above, the object of taking data is to provide 
a basis for action, and an adjusted value is a derived number that can 
be used for the purpose intended, if it is possible to be had from the
data presented for adjustment. 

The principle of least squares provides a method for getting an 
adjusted value. It can be applied whether or not the data are 
worth adjusting, but the results are useful only when the data are 
good in the first place; no purely mathematical procedure can 
make a good figure out of any number of bad ones. Data not in
statistical control (i.e., not random) are not usefully adjusted.
It is important to know when data are worth adjusting. A partial 
answer will be arrived at in this section. 

Suppose that one were to repeat the operation of measurement
n times, thus getting n numbers for the length of the table, denoted
as $x_1, x_2, x_3, \ldots, x_n$.

    Observation number    Observed value

            1                 $x_1$
            2                 $x_2$
            3                 $x_3$
            .                   .
            n                 $x_n$

The problem is to adjust these observations,
i.e., to derive from them a number that can be used as the length of
the table, for ordering the glass. Assuming that the procedure is
being followed correctly, one must answer the question: would the
mean of the n measurements be better than any one of the measurements
drawn at random? Would the median of these n observed
values be better than the mean? Would it be still better to
average the greatest and least of the n observed values? Why not
just take any one of the observed values and use it? We shall
proceed to some considerations that may help to provide useful
answers to these questions.

A statistician is expected to make better adjustments than anyone
else. That is his business. However, the statistician must
insist that he be rated, not on some individual adjustment, but in a
population of adjustments, that is, on a “long run” of adjustments.
If, in the long run, he has a greater percentage of satisfactory results
than anyone else could have gotten, then his method of adjustment 
is better. In isolated instances, his results may not be so good as 
those obtained by someone else, yet his method may be better in 
the sense just stated. 

Any measurement or any adjusted value is a prediction in the 
sense that the number that we are going to use for the length of a 
table is about what we should expect anyone else to get if he were 
to measure the table. As a matter of fact, every empirical scientific
statement is a prediction, because, no matter how many times
it has been confirmed in the past, it is always subject to future
confirmation by experiment. Any measurement is but one term in a
sequence of terms (results) that actually or theoretically might yet 
be taken by repeated applications of the operation of measurement. 
It is important to realize that it is not the one measurement, alone, 
but its relation to the rest of the sequence that is of interest. We 
should not risk designating a measurement a measurement if we 
did not think that it could be duplicated within stated limits by 
future measurements, and that future action would bear out the 
usefulness of the number so designated. 

2. Randomness and the importance of order. In attempting to
answer the questions that have been raised in the preceding paragraphs,
let us make a chart, showing the results of repeating our
operation of measurement. Let us plot the observed values as
ordinates and the observation numbers as the abscissas. Suppose
the chart has the appearance of Fig. 1. The observations show a
trend. Under such circumstances should we take the average,
median, or any other function of these observations for an adjusted
value? The answer is no. Something is wrong with the procedure
or the measuring instrument. The first thing to do is to find out
what the trouble is.

Let it be noted that the trend is recognizable only when there are 
a number of measurements. If only one or two measurements had 
been taken, the trend and the existence of any difficulty would not 
have been recognized, and the glass ordered would not fit. 

Fig. 1. A chart showing the observed value plotted against the observation
number (order of observation, 1 to n). The observations exhibit a trend.

Now let us do something else. Suppose that each number in the
above table is written on a poker chip. Let these chips be physically
similar, put into a bowl and thoroughly stirred, and then
drawn one at a time with shuffling between draws. With care,
this operation of getting numbers will be a random operation, and
will accordingly produce a random sequence. One actual random
operation gave the sequence of numbers shown in Fig. 2.
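
The drawing of chips from a bowl can be imitated in a few lines of code. The following is only a sketch under stated assumptions: the fifteen readings are hypothetical (sixteenths of an inch above an arbitrary base, chosen to creep upward in the manner of Fig. 1), and the helper name `slope_against_order` is introduced here for illustration. It shows that the drawing leaves the numbers themselves unchanged while altering their order, and with it the apparent trend.

```python
import random

def slope_against_order(values):
    """Least squares slope of the values plotted against 1, 2, ..., n."""
    n = len(values)
    order = range(1, n + 1)
    mean_order = sum(order) / n
    mean_value = sum(values) / n
    num = sum((i - mean_order) * (v - mean_value) for i, v in zip(order, values))
    den = sum((i - mean_order) ** 2 for i in order)
    return num / den

# Hypothetical readings, in sixteenths of an inch above an arbitrary base.
observed = [0, 1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10]

rng = random.Random(1943)   # fixed seed so that the run can be repeated
drawn = observed[:]         # one chip per observation
rng.shuffle(drawn)          # stir the bowl and draw the chips in turn

slope_observed = slope_against_order(observed)
slope_drawn = slope_against_order(drawn)
print("slope in order of observation:", slope_observed)
print("slope in order of drawing:    ", slope_drawn)
```

Because the hypothetical readings never decrease, no rearrangement of them can show a steeper slope than the original order does; a typical drawing shows a much flatter one.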

Just what is the difference between Figs. 1 and 2? The numbers
plotted are the same, i.e., for every number shown in one chart,
there is the same number in the other. What is different, then?
The order of the observations is different. The original order has
been destroyed by drawing the numbers from a bowl, and there is
no more trend. On this account alone, the actions that would be
taken on the basis of the two charts are entirely different. The
action based on the series of n measurements shown in Fig. 1 would
be first to try to find out what is the trouble with the procedure of
measurement that produced the n observations. Ordering the
glass would come second: we should defer ordering it until we get
better (more useful) measurements. On the other hand, if Fig. 2
had been the result of the actual measurements, we could go ahead
and order the glass at once. It is not alone the observed values
that count; their relation to one another in the order of production is
also important.

Fig. 2. The observations here exhibit randomness. These are the same
numbers as shown in Fig. 1, but their order has been made random by drawing
them from a bowl. (Observed value plotted against order of drawing, 1 to n.)

When the trend of Fig. 1 occurs in actual observation with a 
measuring instrument, we immediately suspect something is wrong, 
and we try to find the difficulty. But if a trend like this were to 
occur under the ideal conditions of sampling (drawing the numbers 
from a bowl), we have no suspicions, but simply accept this result 
as one of the things that is going to happen once in a while. 

Let it be noted that trends and other patterns resulting from
repeated observations occur not only in physical measurements,
but also in the social sciences. For instance, in a survey that is
repeated monthly, if a person repeatedly answers the same questionnaire,
his answer may vary from time to time, and may even
show a trend, even though his status in life, measured by usual
standards, has not changed appreciably. Repetition alone may
be the cause of more careful attention to the details of the questions,
gradually bringing about a different evaluation of the
same circumstances, resulting in a trend. Moreover, repetition 
of a question month by month may actually produce a trend; 
for instance, if a housewife weighs out her flour week by week in 
order to record for some survey the amount she uses, she may 
become flour conscious and gradually use more or less than she 
did before. 

Results like the points in Fig. 2 show stability, or randomness, 
and can be statistically adjusted to get a figure that can be used. 
It is to be noted that a rather large number of measurements is 
required before one can say that the operation of measurement is 
random. Visual inspection of the chart is often sufficient, but 
the more dependable Shewhart criterion of randomness¹ can be
used if desired. The main thing is to have enough observations, 
and to plot them. With enough experience in using a particular 
method of measurement it may not be necessary to do this. The 
point is that before the observations can be adjusted, they must 
arise from a random operation. 

The adjustment itself may be a very simple procedure. It
might consist of merely picking out any one of the n observations
in Fig. 2 by lot, as one might be willing to do if (after randomness
is assured) the measurements are all seen to lie within a band that
is narrower than the requirement. Even if one of the n observations
is picked out by lot, the other observations of the sequence
are not thrown away; they all provide information. Together
they perform two functions: (i) they help to demonstrate the
randomness of the operation of measurement, and (ii) they show
that the band of variation is so small that any one of them alone
will suffice.

¹ The Shewhart criterion of randomness is described in his book entitled
Economic Control of Quality of Manufactured Product (Van Nostrand, 1931);
also in his book Statistical Method from the Viewpoint of Quality Control (The
Graduate School, Department of Agriculture, 1939). It is described and used
in the pamphlet entitled “Control Chart Method of Controlling Quality during
Production” (American Standards Association, 29 West 39th St., New York,
1942).

The method of adjustment might of course be slightly more 
complicated. One might take the mean, or the median, of the n 
observations. The mean is in fact the least squares adjustment, 
as we shall learn in Chapter II. One could also conceivably split 
the difference between the greatest and least of the observations 
to get an adjusted value. 
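
That the mean is the least squares value can be seen in a few lines. What follows is a sketch of the standard argument for observations of equal weight; the symbol m for the adjusted value, and the use of S here for the sum of squared residuals, are notation assumed for this illustration.

```latex
\[
S(m) = \sum_{i=1}^{n} (x_i - m)^2 , \qquad
\frac{dS}{dm} = -2 \sum_{i=1}^{n} (x_i - m) = 0
\;\Longrightarrow\;
m = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} .
\]
```

Since the second derivative of S with respect to m is 2n, which is positive, the mean gives a minimum of S and not a maximum.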

The advantage of these slightly more complicated methods of 
adjustments is that if they are carried out for repeated sets of n 
measurements, the adjusted values so produced will fall within a 
narrower band than the band corresponding to the original observations.
For most random operations, the least squares adjustments
will show the narrowest band of all, and this is a very practical
argument in favor of least squares.
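
The narrowing of the band can be tried out by simulation. The sketch below rests on assumptions that are not part of the text: each observation is taken to be a normal deviate with true value 0 and standard error 1, each set holds n = 15 observations, and 2000 sets are drawn. The function name `spread_of_adjusted_values` is hypothetical. For each set it records one observation picked by lot, the mean, the median, and the average of the greatest and least, and then reports the standard deviation of each kind of adjusted value over the repeated sets.

```python
import random
import statistics

def spread_of_adjusted_values(n=15, repetitions=2000, seed=1938):
    """Standard deviation of each kind of adjusted value over repeated sets.

    Assumes normal deviates (true value 0, standard error 1); other random
    operations may rank the methods differently.
    """
    rng = random.Random(seed)
    results = {"single": [], "mean": [], "median": [], "midrange": []}
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        results["single"].append(rng.choice(sample))       # one picked by lot
        results["mean"].append(statistics.fmean(sample))   # least squares value
        results["median"].append(statistics.median(sample))
        results["midrange"].append((min(sample) + max(sample)) / 2)
    return {name: statistics.pstdev(values) for name, values in results.items()}

spread = spread_of_adjusted_values()
for name, value in spread.items():
    print(f"{name:9s} {value:.3f}")
```

Under these assumptions the mean should show the smallest spread, with the median and the average of the extremes lying between it and the single observation. The ranking belongs to the normal assumption made here and is not claimed for every random operation.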

3. Performing a simple adjustment. Simple problems are always
best for illustration: if we can understand simple problems,
there is some hope that we can understand more complicated ones.
One of the best to look at from the standpoint of adjustment is a
plane triangle in which the three angles have been observed by some
angular measuring instrument, such as a transit or a protractor.
In the triangle of Fig. 3 the three angles have been measured once
each, with the results shown in the table. The observed angles add
up to 179° 30'. Suppose we demand that the angles be adjusted so
that their sum is 180° exactly. Two methods of adjustment might at
times suggest themselves.

Fig. 3. The three angles of a triangle have been measured. The sum of
the adjusted angles is forced to be 180°.

Method 1. Distribute the 30' deficiency amongst the three
angles in proportion to size.

Method 2. Distribute the 30' deficiency equally amongst the
three angles.




                               Adjusted
    Angle    Observed    Method 1    Method 2

      1      120° 07'    120° 27'    120° 17'
      2       38° 23'     38° 29'     38° 33'
      3       21° 00'     21° 04'     21° 10'

     Sum     179° 30'    180° 00'    180° 00'

The two methods of adjustment give the results shown in the 
table. Which do you prefer? Either is simple enough, and your 
preference will be easily settled, depending on the circumstances. 
If the protractor is correctly graduated, then the measurements 
of an angle may be randomly distributed about its true value, 
resembling in a fashion those in Fig. 2. Under such circumstances 
one would prefer Method 2. If on the other hand the protractor 
had been stretched in its manufacture so that the 180° index 
actually extends through more than half a circumference, then, 
though the measurements of any angle be randomly distributed, 
they will be distributed around a value that is too small. Under 
such circumstances, angular measurements need to be corrected 
by small additions, proportionate to the size of the angle measured. 
This is what Method 1 calls for. We cannot say that either of the
two methods is better; each has its place, depending on the circumstances.
(Cf. Sec. 5 also.)
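
The arithmetic of the two methods can be checked with a short calculation. The sketch below works in minutes of arc and uses the observed angles from the table; the helper names `to_minutes` and `to_degrees_minutes` are introduced here for illustration, and the proportional shares of Method 1 are rounded to the nearest minute, which is an assumption about how the tabulated figures were obtained.

```python
def to_minutes(degrees, minutes):
    """Convert degrees and minutes of arc to minutes."""
    return 60 * degrees + minutes

def to_degrees_minutes(total_minutes):
    """Convert minutes of arc back to a (degrees, minutes) pair."""
    return divmod(total_minutes, 60)

observed = [to_minutes(120, 7), to_minutes(38, 23), to_minutes(21, 0)]
deficiency = to_minutes(180, 0) - sum(observed)   # 30 minutes

# Method 1: shares in proportion to the size of each observed angle,
# rounded to the nearest minute.
method_1 = [a + round(deficiency * a / sum(observed)) for a in observed]

# Method 2: equal shares (30 minutes divides evenly among three angles).
method_2 = [a + round(deficiency / len(observed)) for a in observed]

for name, angles in (("Method 1", method_1), ("Method 2", method_2)):
    print(name, [to_degrees_minutes(a) for a in angles],
          "sum", to_degrees_minutes(sum(angles)))
```

With other data the rounded shares need not add exactly to the deficiency, in which case a minute would have to be added to or taken from one of the angles.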

As we shall see later (Sec. 31), Method 2 is the least squares 
adjustment under the assumption that the protractor is correctly 
graduated. Thus, the method of least squares seems to lead to a 
simple and common sense procedure. It will be so wherever the 
problem is simple enough to visualize. In a later chapter we shall 
return to the problem of the triangle, in which the least squares 
procedure will be worked out for more complicated situations, in 
which the angles have been measured more than once, or an unequal 
number of times, and the sides may have been measured also 
(Sec. 34; pp. 74 ff.). 

Another simple example is a line that has been divided into
segments, and some action is to be based on their lengths. The
observed lengths of the segments do not add to the observed length
of the whole line, and an adjustment is required. A very simple
procedure would be to apportion the excess or deficiency equally
amongst the segments and the whole line: if the segments are in
excess, their observed values will each be decreased by a certain
amount, and the observed value of the whole line will be increased
by the same amount, this amount being the excess divided by
n + 1, where n is the number of segments. This adjustment is
very easily applied. It actually is the least squares adjustment.
More complicated problems of this type occur when the segments are
not all measured the same number of times, or are measured with
instruments of different precisions. Such problems will form the
object of later attention, but we pause here to note that in this
simple case the least squares procedure provides an easy and
satisfactory adjustment. (See Exercise 2, p. 86.)

Fig. 4. The line AE and its four segments have been measured. The sum
of the adjusted segments is forced to equal the adjusted over-all length.
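
The equal apportionment just described can be written out as a short calculation. This is a minimal sketch for the simple case of one measurement of each length, all of equal precision; the function name `adjust_segments` and the four segment lengths and over-all length are hypothetical values chosen for illustration.

```python
def adjust_segments(segments, whole):
    """Apportion the excess equally among the n segments and the whole line."""
    n = len(segments)
    excess = sum(segments) - whole      # negative when there is a deficiency
    share = excess / (n + 1)
    adjusted_segments = [s - share for s in segments]   # each segment decreased
    adjusted_whole = whole + share                       # whole line increased
    return adjusted_segments, adjusted_whole

# Hypothetical observed lengths, all in the same unit.
segments = [25.3, 18.9, 30.6, 25.7]     # four segments, as in a line AE
whole = 100.0                           # observed over-all length
adjusted_segments, adjusted_whole = adjust_segments(segments, whole)

print("adjusted segments:", adjusted_segments)
print("sum of adjusted segments:", sum(adjusted_segments))
print("adjusted over-all length:", adjusted_whole)
```

In this example the segments are in excess by 0.5, so each of the five observed values is altered by 0.1, after which the adjusted segments add to the adjusted over-all length apart from rounding in the floating-point arithmetic.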

In more complicated problems, it is not so easy to picture what 
happens in the adjustment, but we shall be able to apply the same 
principles in working them out. Problems of curve fitting are
essentially of the same nature as the geometric illustrations just
given: in each problem the adjustment consists of altering the 
observed values in order to satisfy certain conditions that we 
decide to impose on the adjusted values. 

4. Least squares adjustments often easy. It is sometimes supposed
that the method of least squares is more difficult than most
methods to carry out. This is not always so; least squares is often
the simplest and most satisfying of all known methods. In many
problems, normal equations are not required. It all depends on
what conditions are to be imposed, or how rigidly the user insists
on fulfilling them. There are some problems in which least
squares provides the only known method at any price, as for instance,
complicated problems in triangulation and geodesy; also
the adjustment of the observations in curve fitting when the
weights vary and more particularly when both coordinates are
subject to errors of observation. We have already seen some
examples in which the least squares adjustment is simple and
direct; for instance, it was noted in connexion with the adjustment
of the observations in Fig. 2 that the least squares adjustment
happens to be identical with the mean of the n observations,
and calculating a mean is usually a simple enough procedure.
We saw likewise that in the adjustment of the triangle by Method 2, 
the least squares adjustment turned out to be merely the equal 
distribution of the deficiency amongst the three angles. The least 
squares adjustment of the segments of the line in the last section 
is moreover simple enough, again being merely an equal distribution
of the deficiency or excess of the segments. Students who
have fitted polynomials in the form of orthogonal polynomials will 
realize that the method of least squares, though perhaps not simple, 
is at least a routine matter, not involving the solution of normal 
equations. There is another illustration, contained in Chapter 
VII, in which tables of frequencies obtained by sample surveys 
are adjusted to expected marginal totals that are obtained from 
other considerations, such as a complete count; here again the 
adjustment can be made rapidly without the solution of normal 
equations. One could go on and point out many other problems 
in which the least squares adjustment is about as simple to carry 
out as any that could be devised, in view of the conditions imposed 
on the adjusted values. 

5. Statistical methods and correction for biases. There is
another kind of adjustment, which might be referred to as an
adjustment for bias. Laboratory instruments are often calibrated
against a standard, and a correction factor applied to the measurements.
Similar corrections are often required in canvasses in the
social sciences. A mailed questionnaire, for example, usually 
requires corrections, because not everyone responds, and those 
that do not, form a class distinct from those that do. Moreover, 
the responses in a mailed questionnaire will be different from the 
responses in an interview questionnaire, even for questions worded 
identically. Increasing the size of the sample, or the number of 
observations, will decrease the sampling errors, but not the biases. 
A statistical adjustment is applied primarily to effect compromises
with statistical fluctuations (sampling errors and errors of observation).
A bias is never discovered or measured, nor has any meaning,
unless two or more distinct methods of observation or experimentation
are compared with each other. Statistical adjustments
of data, together with the Shewhart statistical methods of quality
control, are powerful tools in the detection of biases, difference
in performance, deterioration or other changes in quality, the
standardization of quality, and a host of important related
problems.
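
The remark that more observations reduce the sampling errors but not the biases can be illustrated by simulation. The sketch below is built on assumptions made only for the illustration: an instrument whose readings are normal deviates about the true value plus a constant bias, with hypothetical figures for the true value, the bias, and the standard error, and a hypothetical helper `mean_of_readings`.

```python
import random
import statistics

def mean_of_readings(n, true_value, bias, sigma, rng):
    """Mean of n readings from an instrument with a constant bias."""
    return statistics.fmean(
        true_value + bias + rng.gauss(0.0, sigma) for _ in range(n)
    )

rng = random.Random(1948)                    # fixed seed for a repeatable run
true_value, bias, sigma = 57.0, 0.5, 1.0     # hypothetical figures, in inches

small = mean_of_readings(10, true_value, bias, sigma, rng)
large = mean_of_readings(10_000, true_value, bias, sigma, rng)

print("error of the mean of 10 readings:    ", small - true_value)
print("error of the mean of 10,000 readings:", large - true_value)
```

The mean of the larger set settles close to the true value plus the bias, not to the true value itself; only a comparison with a second method of measurement would reveal the bias.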

Simultaneous adjustment for bias and statistical fluctuations 
can often be made, as when sample frequencies constituting obser¬ 
vations on the breakdown of a certain class of the population are 
adjusted to the known total of that class (Ch. VII), or when a line 
is forced to pass through the origin because theory and other 
related knowledge of the subject tell us that it should (Sec. 15, 
p. 31). In the triangle problem of Section 3, Method 1 simul¬ 
taneously corrects for a stretched or compressed scale, and for 
statistical fluctuations of the measurements; but we should not be 
in position to choose between Methods 1 and 2 without knowing 
somehow or other from other experience with it whether the pro¬ 
tractor scale i. is uniformly stretched or compressed, or ii. can be 
considered perfect. 

6. Repeated experimental results necessary for establishing a 
scientific law. It would be splendid if all action required in social, 
economic, and industrial planning could be based on scientific laws; 
but actually, so many of the laws remain yet to be discovered that 
most action must necessarily be taken on the basis of knowledge of 
the subject matter in related fields. Of course, it is true that 
action is often prompted by prejudices and whims, even when a 
scientific basis for action exists, but this is a failing of human 
nature, hardly a problem in mathematics or statistics. 

No one experiment by itself establishes a law, or a valid basis 
for action. It is the consistency of repeated results under a variety 
of conditions that establishes a law. The method of least squares 
can be applied to a single set of data, but no matter how carefully 
the least squares adjustment is carried out, the curve so fitted, or 
the observations so adjusted, do not have scientific validity unless 
there is other evidence at hand to show under what conditions the
same or similar results will be obtained, and how these conditions
are to be brought about and controlled. 2 

A long series of experiments may provide the additional evidence 
that is needed, particularly if the different experiments of the series 
are performed under a variety of conditions (different tempera¬ 
tures, climatic conditions, economic levels, etc.). If the data in 
each experiment are random or nearly so (see Fig. 2 and discussion), 
and if the adjusted coordinates or the adjusted parameters in the 
fitted curve turn out to have about the same values, time after 
time, without fail, a scientific law may be considered established, 
and the conditions under which it holds may be stated. 

Thus, to be more specific, it is not the standard error of a slope, 
as estimated from a single set of data, but rather the persistent 
smallness of the standard error, or the persistent recurrence of the 
slope, in experiment after experiment, under a variety of condi¬ 
tions, that really attains scientific significance. By this we mean 
that useful predictions can be made regarding future slopes, and
that we can say under what conditions these slopes will be main¬ 
tained. Repeated patterns lie at the basis of scientific significance. 
Repeated and repeatable good fits, and repeated and repeatable 
statistical significance, establish a scientific law. In science one 
is usually if not always studying the underlying system of forces 
(social, economic, mechanical, chemical, or whatever), in order to 
take action on the cause system, to regulate future product. 
Measurements or surveys already carried out on some one particu¬ 
lar batch of product (population of people, lot of industrial product, 
etc.), provide part but only part of the chain of evidence that is 
required for predictions with regard to data of the future. The
operationally verifiable meaning of a scientific law is a prediction
of future results, not a statement of past results.

2 “. . . it being justly esteemed an unpardonable temerity to judge the
whole course of nature from one single experiment, however accurate or
certain.” From Hume’s An Enquiry Concerning Human Understanding
(London, 1748), section vii, part 2.

“But to argue, without analysis of the instances, from the mere fact that a
given event has a frequency of 10 percent in the thousand instances under
observation, or even in a million instances, that . . . it is likely to have a
frequency near to 1/10 in a further set of observations, is . . . hardly an
argument at all.” J. M. Keynes, Treatise on Probability (Macmillan, 1929),
p. 407.

Every experiment in a series should be designed and performed 
with care and judgment, even if many more experiments are 
required. Likewise, the data of each experiment should be sum¬ 
marized in the most efficient manner for comparison with the 
other sets of data. The importance of statistical theories and the 
design of experiments can not be overemphasized. Under condi¬ 
tions of randomness, the method of least squares usually provides 
a good summary of an experiment by preserving most of the 
information in the data, provided the right curve is being fitted. 
In some problems the method of least squares is simple and easy 
to apply; in others it is difficult (Sec. 4). In some kinds of 
problems, no other method is known. What method of adjust¬ 
ment to use (least squares, free hand curves, etc.) is as much a 
matter of economics as science, and must be decided on the basis 
of time, costs, and results. It is more important to insist on having 
a series of experiments carried out under a variety of conditions, 
than to insist on using any particular method of adjustment. 

7. The nature of an adjustment. A student of statistical theory 
may well wonder how the adjustment of data differs from other 
statistical calculations, and in particular the calculations that are 
performed in problems of estimation. A problem of adjustment 
might be identified as a problem in estimation in which the end 
product is a set of adjusted values, which have been forced (adjusted)
to satisfy certain conditions.

It is these conditions that distinguish one problem from another. 
In the triangle problem of Section 3, and later on in Sections 31 
and 34, the sum of the adjusted angles is forced to be 180°. In 
adjusting the line and line segments of Section 3, the sum of the
adjusted segments must equal the adjusted total length of the line. 
In curve fitting (Figs. 16 and 17, pp. 132 and 133) there are likewise 
conditions to be fulfilled, because the adjusted observations are 
forced to lie on a so-called calculated curve. The principle of 
least squares (Ch. II) remains the same in all these problems, but 
the different kinds of conditions imposed on the adjusted values 
lead to different procedures in certain preliminary stages of the 
solution. 



CHAPTER II 


SIMPLE ILLUSTRATIONS OF CURVE FITTING 

8. The principle of least squares. Before going into the general 
problem of the adjustment of observations (Ch. IV), it will be 
helpful to apply least squares to some simple applications in curve 
fitting. Fortunately, simple problems afford nearly as much 
opportunity for thought in the field of statistical inference as the 
more complicated ones do. In all of them, simple or complicated, 
the principle of least squares requires the minimizing of the sum of 
the weighted squares of the residuals. This sum may be written as 

$$S = \sum w\,(\mathrm{res})^2 \tag{1}$$

The summation (denoted by $\sum$) of the weighted squares of the
residuals is to be taken over all the observations that are subject
to error. S is called the “sum of squares,” or, more explicitly, the
“sum of the weighted squares.” Weight will be defined in Section 11.

In curve fitting, either or both of the x and y observations may 
be subject to error. Accordingly, S will be written explicitly 
for the x residuals alone, or the y residuals alone, or both, depend¬ 
ing on the experimental conditions. For instance, later on, 
when both the x and y coordinates are in error we shall write 

$$S = \sum \left( w_x V_x^2 + w_y V_y^2 \right) \tag{2}$$

Here $V_x$ is an x residual, and $V_y$ is a y residual (see Fig. 17 on p. 133).
If only y is subject to error, the first term on the right is to be 
omitted, and if only x is subject to error, the second term is to be 
omitted. In this chapter we shall be content to deal with a few 
simple problems in which only one coordinate has error. 

The principle of least squares is the minimizing of S. The
method of least squares is a rule or set of rules for proceeding with
the actual computation. Here we shall try to learn both, and how
to interpret the results.

We may now define $\chi^2$ by the equation

$$\chi^2 = \frac{S}{\sigma^2} = \frac{1}{\sigma^2} \sum w\,(\mathrm{res})^2 \tag{3}$$

The symbol $\sigma$ denotes the standard error of observations of unit
weight (Sec. 11).

Now since $\sigma$ is a constant in any one problem, $\chi^2$ is a minimum
when S is a minimum; hence we may think of least squares not
only as the minimizing of S, but also of $\chi^2$. Least squares may
also be considered the minimizing of the estimate $\sigma(ext)$, to be
introduced in Section 13. Another way of looking at the problem
is to say that the principle of least squares is the maximizing of
$P(\chi)$, and that we seek the solution that gives the greatest
probability on the chi-test.

Remark . It is interesting to recall Gauss’ Theoria Motus 
statement of the principle of least squares, in particular his 
recognition of the occasional need for compounding errors of 
different dimensions (seconds of arc, seconds of time, length, 
weight, etc.). In curve fitting, this compounding is exempli¬ 
fied as explained above, namely, by taking account of the errors 
in both the x and y coordinates, when both are subject to error, 
just as one would take account of the errors in both the angles 
and the sides of a triangle (Sec. 34). The following quotation 
from Gauss is taken from his Theoria Motus Corporum Coelestium 
(Hamburg, 1809), Art. 179. His h is weight , written in Eq. 2 
and elsewhere as w. His sum hhvv + h'h'v'v' + h''h''v''v'' +
· · · is the S of Eq. 1.

“. . . quamobrem systema maxime probabile valorum pro
quantitatibus p, q, r, s, etc., id erit, ubi aggregatum hhvv +
h'h'v'v' + h''h''v''v'' + etc., i.e., ubi summa quadratorum
differentiarum inter valores revera observatos et computatos per
numeros qui praecisionis gradum metiuntur multiplicatarum fit
minimum. Hoc pacto ne necessarium quidem est, ut functiones
V, V', V'', etc., ad quantitates homogeneas referantur,
sed heterogeneas quoque (e.g., minuta secunda arcuum et
temporis) repraesentare poterunt, si modo rationem errorum,
qui in singulis aeque facile committi potuerunt, aestimare licet.”





This was Gauss’ enunciation of the principle of least squares 
in 1809. In 1823, in his Theoria Combinationis Observationum 
Erroribus Minimis Obnoxiae, he took the view that one seeks 
values for the adjusted observations and parameters which 
render the variance of the parameters a minimum. Both points 
of view are arbitrary, and are justifiable only in experience. 
Fortunately, both points of view lead to the same identical least 
squares solution. An article by A. C. Aitken and H. Silverstone,
“On the estimation of statistical parameters” (Proc. Royal Soc.
Edinburgh, vol. lxi, 1942: pp. 186-194), is instructive.

9. The simplest example of curve fitting — the single sample. 

Let a single sample of n controlled 1 (random) observations be
made on some magnitude, such as the length of a table. Suppose
that it is desired to derive an adjusted value a, from these
observations. We look upon the problem as one in fitting the curve

$$x = a \tag{4}$$

to the n observations. This is the simplest of all curves; it is
merely a horizontal line (Fig. 5). It contains but one adjustable
constant or parameter, namely, a. This parameter a is now to be
determined. The method of least squares will be illustrated.

Fig. 5. Ten observations $x_1, x_2, \ldots, x_{10}$ of equal precision
(equal weight) are made on an unknown magnitude $\alpha$. The true points
are connected by the simple relation $x = \alpha$, hence $x = a$ is the
curve to be fitted. The least squares value of a turns out to be $\bar{x}$,
the mean of the ten observations. $x = \alpha$ is the “true curve,” and
$x = \bar{x}$ is the “calculated curve.” The calculated points are shown by
the crosses; they all lie on the calculated curve. This is the simplest
problem in curve fitting. Compare with Fig. 6, page 25, and with
Fig. 16, page 132.

1 For a partial explanation of controlled observations and randomness,
see the first chapter.

The problem is to minimize S, the sum of the weighted squares 
of the residuals. The observations are all of equal weight, since 
by supposition they appear to have been drawn all from the same 
bowl. We shall therefore let all the weights be unity. If x 
denotes an observation, then x — a is the corresponding residual , 
since, by definition, 

Residual = Observed — Calculated 

The square of the residual will be $(x - a)^2$; hence the sum

$$S = \sum (x - a)^2 \tag{5}$$

will be the quantity to be minimized. The sign $\sum$ means that
the squares of all the residuals are to be summed.

Here the y coordinates of the points are merely the ordinal num¬ 
bers of the observations (1st, 2d, 3d, etc.). The y coordinates are 
of course without error here, so only x residuals appear in the expres¬ 
sion for S. 

Now the n observations, having once been made, can not be 
changed. They are constants. The only variable in Eq. 5 is the 
adjustable parameter a. By giving various values to a, S is made 
to take on various values. There will be a minimum, and it will 
occur when the derivative 

$$\frac{dS}{da} = -2 \sum (x - a)$$

vanishes, that is to say, when

$$\sum (x - a) = 0 \quad \text{or} \quad \sum x - na = 0 \tag{6}$$

The least squares value of a is accordingly

$$a = \frac{\sum x}{n} = \bar{x} = \text{the mean} \tag{7}$$

So the horizontal line $x = \bar{x}$ in Fig. 5 gives the smallest possible
value to the sum of the squares of the residuals, and hence to $\chi^2$.
The line $x = \bar{x}$ is the “calculated curve”; it is the “curve”
$x = a$ fitted by least squares. On this line lie all the “calculated
points,” these being the least squares estimates of the observed
points. (In this problem, the calculated points all have the same
ordinate, namely, $\bar{x}$. Compare Fig. 5 with Fig. 17, p. 133.)

The fit of this line may be judged by comparing the value of
$\chi^2/(n - 1)$ with $\sigma^2$. This is done by looking up $P(\chi)$ for the
observed value of $\chi^2$ corresponding to $n - 1$ degrees of freedom.
This subject will be touched upon again in Section 13 and else¬ 
where. Tables for the use of the chi-test will be found in R. A. 
Fisher’s Statistical Methods for Research Workers (Oliver and 
Boyd). 

Note that the value of $\sigma$ is not required for the application of
least squares, because whatever $\sigma$ is, $\chi^2$ is a minimum when S
is a minimum; $\sigma$ did not occur in Eq. 6. $\sigma$ is required,
nevertheless, for the use of the chi-test for the fit of the curve. It is
presumed to be obtained from previous experience with the
measuring instrument. By the time enough data are gathered
to attain and test for randomness, $\sigma$ will be known closely
enough.

Note also that if s denotes the standard deviation of the n
measurements, then $\chi^2 = ns^2/\sigma^2$, and the minimized value of
S is $ns^2$. For a new sample of n observations there will be a
new mean, $\bar{x}$, a new line, and a new $\chi^2$. Any one value of $\bar{x}$ can
form a basis for action only if there is evidence that future
values of $\bar{x}$ would be closely the same. This evidence must
come from experience with the procedure of measurement.
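The results of this section can be checked numerically. The following is only a sketch in Python, under stated assumptions: the ten observations and the value of sigma are invented for illustration (they are not the values plotted in Fig. 5), and the name sum_sq is introduced here and does not come from the text.

```python
# Illustrative observations only; they are not the values plotted in Fig. 5.
x = [31.2, 32.5, 30.8, 33.1, 32.0, 31.7, 32.9, 31.4, 32.2, 32.6]
n = len(x)
sigma = 0.9  # assumed standard error of an observation of unit weight


def sum_sq(a, obs):
    """S of Eq. 5: sum of squared residuals (observed - calculated), unit weights."""
    return sum((xi - a) ** 2 for xi in obs)


a_hat = sum(x) / n          # Eq. 7: the least squares value of a is the mean
S_min = sum_sq(a_hat, x)    # minimized sum of squares

# Moving a away from the mean in either direction can only increase S.
for delta in (-0.5, -0.01, 0.01, 0.5):
    assert sum_sq(a_hat + delta, x) > S_min

# Squared standard deviation of the n measurements, with divisor n,
# so that the minimized S equals n * s^2.
s2 = sum((xi - a_hat) ** 2 for xi in x) / n
chi2 = S_min / sigma ** 2   # Eq. 3 with unit weights

print("a =", a_hat, " S =", S_min, " n*s^2 =", n * s2, " chi^2 =", chi2)
```

The loop is not a proof, only a spot check; the argument from the derivative in Eq. 6 is what establishes the minimum.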

10. The same problem with unequal weights. (a) Direct solution.
In the preceding section, all the observations had equal
weight; 2 they were “drawn from the same bowl” (Ch. I). Suppose
now that the n observations $x_1, x_2, \ldots, x_n$ have weights
$w_1, w_2, \ldots, w_n$, perhaps not all equal. The observations are now
drawn from bowls having the same mean, but perhaps various
standard deviations. The procedure is formally very similar to
what it was before. We are now to make the sum of the weighted
squares of the residuals a minimum, so we write

$$S = \sum w_i (x_i - a)^2 \tag{8}$$

the ith residual being, as before, $x_i - a$. Here, the weight $w_i$ is
introduced because the weights are not all unity. As before, S
is to be a minimum with respect to a. The derivative

$$\frac{dS}{da} = -2 \sum w_i (x_i - a)$$

when set equal to zero gives

$$\sum w_i (x_i - a) = 0$$

or

$$a \sum w_i = \sum w_i x_i \tag{9}$$

whence

$$a = \frac{\sum w_i x_i}{\sum w_i} = \bar{x} \tag{10}$$

where $\bar{x}$ is now the weighted mean of the n observations. In the
event that $w_1 = w_2 = \cdots = w_n$, this result reduces to the previous
value of a in Eq. 7. In other words, the problem of the preceding
section (equal weights) was a special case of this one.

2 The meaning of weight will be learned in the next section.

The minimized value of S is here 

$$S = \sum w_i (x_i - \bar{x})^2 = \sum w_i x_i^2 - \bar{x}^2 \sum w_i \qquad \text{(See p. 151.)} \tag{11}$$
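A short numerical illustration of Eqs. 9 to 11 may be useful here. It is a sketch only: the four observations and their weights are invented, and the variable names are ours rather than the text's.

```python
# Invented observations and weights, for illustration only.
x = [10.2, 9.8, 10.5, 10.1]
w = [1.0, 2.0, 0.5, 4.0]

sum_w = sum(w)
sum_wx = sum(wi * xi for wi, xi in zip(w, x))
sum_wx2 = sum(wi * xi * xi for wi, xi in zip(w, x))

a_hat = sum_wx / sum_w  # Eq. 10: the weighted mean

# Left side of Eq. 11: residuals computed one by one, squared, weighted, summed.
S_direct = sum(wi * (xi - a_hat) ** 2 for wi, xi in zip(w, x))
# Right side of Eq. 11: the short cut that needs only three sums.
S_short = sum_wx2 - a_hat ** 2 * sum_w

# Normal equation: the weighted residuals sum to zero at the minimum.
weighted_residual_sum = sum(wi * (xi - a_hat) for wi, xi in zip(w, x))

# With equal weights the result reduces to the ordinary mean of Eq. 7.
equal_w = [3.0] * len(x)
a_equal = sum(wi * xi for wi, xi in zip(equal_w, x)) / sum(equal_w)

print(a_hat, S_direct, S_short, weighted_residual_sum, a_equal)
```

In floating-point arithmetic the short cut subtracts two nearly equal numbers, so with many digits of agreement between the terms the direct form may be the safer of the two; that is a remark about machine arithmetic, not about the algebra of Eq. 11.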

(b) Tabular solution. In Section 61 we shall see a systematic
procedure for the solution of normal equations and for calculating
the “reciprocal matrix,” in which are found the variance and
product variance 3 coefficients; also we shall see the minimized
value of S calculated right along with the solution of the normal
equations. In simple problems like the one just considered, there
is only one normal equation (Eq. 9) and it is of course very easily
solved (see Eq. 10). Nevertheless, it is interesting to see how the
routine process that is to be shown in Section 61 applies here.
Let us therefore set up the following tabulation, and perform the
steps indicated below. The subscript i in the summations has
been omitted for the sake of brevity.

3 Following Aitken, the term product variance is used here rather than
covariance.


Row    a          = 1                                        C

I      $\sum w$   $\sum wx$                                  1
2      · · ·      $\sum wx^2$                                0
3      · · ·      $-(\sum wx)^2 / \sum w$                    · · ·
II     · · ·      $\sum wx^2 - (\sum wx)^2 / \sum w$         · · ·



An ellipsis (· · ·) in the tabular array denotes a space wherein a
number would ordinarily be entered in numerical calculation, but
in which it is not worth while to show the entry in symbols.

Row I is the main equation; it is equivalent to Eq. 9. Each
letter across the heading of the tabulation is to be multiplied by
the coefficient standing below it in Row I. Row 2 contains the
sum of the weighted squares of the x values, measured from zero.
The C column is filled in as shown. Row 3 comes by multiplying
Row I through by $-\sum wx / \sum w$. Row II comes by adding
Rows 2 and 3. In the “1” column of Row II is found the quantity

$$S = \sum wx^2 - \frac{(\sum wx)^2}{\sum w} \quad \text{or} \quad \sum wx^2 - \bar{x}^2 \sum w$$

as already derived in Eq. 11. S is here seen as the initial sum of
weighted squares, $\sum wx^2$ in Row 2, reduced by the amount
$(\sum wx)^2 / \sum w$ to take account of the fact that the residuals are
finally measured from the fitted line x = a, instead of from 0.
It is interesting to perceive that this minimized value of S has
come forth without the intermediate step of computing each residual
after adjustment, then squaring it, and then adding them all
together. Similar short cuts, due to Gauss, will be found also in
the more complex problem, as will be seen later. (See Secs. 29,
59, and 61; also compare with Sec. 15b.)

Row I solved for a gives

$$a = \frac{\sum wx}{\sum w}$$

in agreement with Eq. 10. However, if we use the C column in
place of the “1” column in solving for a, we get $1/\sum w$.
Interpreted, this means that

$$\frac{1}{w_a} = \frac{1}{\sum w} \tag{12}$$


This solution can be looked upon as the one and only term in the
reciprocal matrix (to be encountered later in extended form;
Sec. 61). This one term is the variance coefficient of a, which
interpreted means that

$$\sigma_a^2 = (\text{S.E. of } a)^2 = \frac{\sigma^2}{w_a} = \frac{\sigma^2}{\sum w} \tag{12'}$$

The standard error of a thus decreases as the weights of the
individual observations increase. This equation will be understood
better after the discussion on weights has been read (next section).
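The row operations of the tabular solution can be imitated in a few lines. The sketch below is an assumption-laden illustration, not the routine of Section 61: the data, the weights, and the value of sigma squared are invented, and the entries that the tabulation replaces by ellipses are filled in as ordinary elimination would supply them, which is our assumption.

```python
# Invented observations, weights, and unit-weight variance, for illustration only.
x = [10.2, 9.8, 10.5, 10.1]
w = [1.0, 2.0, 0.5, 4.0]
sigma2 = 0.04  # assumed variance of an observation of unit weight

sum_w = sum(w)
sum_wx = sum(wi * xi for wi, xi in zip(w, x))
sum_wx2 = sum(wi * xi * xi for wi, xi in zip(w, x))

# Columns, in order: a, "1", C.
row_I = [sum_w, sum_wx, 1.0]       # the normal equation (Eq. 9), with C = 1
row_2 = [sum_wx, sum_wx2, 0.0]     # first entry is our assumption (shown as an ellipsis)

factor = -sum_wx / sum_w
row_3 = [factor * v for v in row_I]             # Row I multiplied through
row_II = [p + q for p, q in zip(row_2, row_3)]  # Row 2 + Row 3

a_hat = row_I[1] / row_I[0]        # solve Row I with the "1" column (Eq. 10)
var_coeff = row_I[2] / row_I[0]    # solve Row I with the C column (Eq. 12)
S_min = row_II[1]                  # minimized S, found in the "1" column of Row II
var_a = sigma2 * var_coeff         # Eq. 12': square of the S.E. of a

print(a_hat, var_coeff, S_min, var_a)
```

The point of the exercise is that S_min appears without any residual having been computed, in line with the short cut described above.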

11. A digression to define weights. By definition, the weight $w_f$
of the function f is inversely proportional to the variance $\sigma_f^2$ of f.
That is to say, $1/w_f$ is the variance coefficient of f. In symbols,

$$w_f = \frac{\sigma^2}{\sigma_f^2} \quad \text{or} \quad \sigma_f^2 = \frac{\sigma^2}{w_f} \tag{13}$$

$\sigma^2$ is simply a proportionality factor, and is evidently the variance
of a function of unit weight. If $\sigma^2$ be arbitrarily doubled, and $w_f$
also doubled, $\sigma_f^2$ is unaffected in value.

For example, let f be $\bar{x}$, the mean of the n observations
$x_1, x_2, \ldots, x_n$, which are random variates taken from a universe of
standard deviation $\sigma$, hence each of unit weight. Then, since the
variance of $\bar{x}$ is $\sigma^2/n$, substitution of $\bar{x}$ for f in Eq. 13 gives

$$w_{\bar{x}} = \frac{\sigma^2}{\sigma^2/n} = n \quad \text{or} \quad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \tag{14}$$

whence we see that n is the weight of $\bar{x}$ and 1/n its variance
coefficient. Or, if the n original observations were each of weight
w instead of unity (as we could as well say, since weights are
relative and not absolute, depending as they do on the arbitrary
factor $\sigma^2$), then the variance of single observations would be $\sigma^2/w$,
and the variance of $\bar{x}$ would be one nth as much. In this case,
therefore, Eq. 13 gives

$$w_{\bar{x}} = \frac{\sigma^2}{\sigma^2/nw} = nw \quad \text{or} \quad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{nw} \tag{15}$$

saying that nw is now the weight, and 1/nw the variance coefficient,
of $\bar{x}$. So, as before, the weight of $\bar{x}$ is just n times the weight of a
single observation.

“The primal conception of a weight is that of a repeated
observation.” 4 In Fisher’s terminology, the mean $\bar{x}$ of n observations
contains n times as much information as a single observation.

Concerning two functions $f_1$ and $f_2$, it can be said at once from
Eq. 13 that

$$w_1 : w_2 = \sigma_2^2 : \sigma_1^2 \tag{16}$$

which says that the weights of two functions are inversely
proportional to their variances.

Exercise 1. If $V_i$ denotes the residual at point i, $w_i$ the weight
of the observation, then $\chi^2 = (1/\sigma^2) \sum wV^2$. Show that this may
be written

$$\chi^2 = \sum \left( \frac{V_i}{\sigma/\sqrt{w_i}} \right)^2$$

which says that $\chi^2$ is the sum of the squares of the residuals, each
residual ($V_i$) being measured in units of the standard error $\sigma/\sqrt{w}$
of the corresponding observation (compare with Eq. 20, next section).
In other words, $\chi^2$ is the sum of the squares of the standardized
residuals. $\chi^2$ is therefore independent of the units used in
measurement; a change from feet to inches or centimeters, or
from pounds to ounces or grams, changes the residuals, but not
the standardized residuals, nor $\chi^2$.

4 E. B. Wilson and Ruth R. Puffer, “Least squares and laws of population
growth,” Proc. Amer. Acad. Arts and Sci. (Boston), vol. 68, August 1933.
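The independence of chi-square from the units of measurement, asserted in Exercise 1, can be seen in a small computation. This is a sketch under assumptions: the lengths, weights, and the unit-weight standard error are invented, and the function name chi_square is ours.

```python
import math

# Invented measurements of one length, in feet, with invented weights.
x_ft = [5.01, 4.98, 5.03, 4.99]
w = [1.0, 4.0, 1.0, 2.0]
sigma_ft = 0.02  # assumed standard error of an observation of unit weight, in feet


def chi_square(obs, weights, sigma):
    """Sum of squared standardized residuals about the weighted mean."""
    a = sum(wi * xi for wi, xi in zip(weights, obs)) / sum(weights)
    return sum(((xi - a) / (sigma / math.sqrt(wi))) ** 2
               for wi, xi in zip(weights, obs))


chi2_feet = chi_square(x_ft, w, sigma_ft)

# The same measurements in inches: observations and sigma both scale by 12.
x_in = [12.0 * xi for xi in x_ft]
chi2_inches = chi_square(x_in, w, 12.0 * sigma_ft)

# The other form quoted in the exercise: (1 / sigma^2) * sum of w * V^2.
a_ft = sum(wi * xi for wi, xi in zip(w, x_ft)) / sum(w)
chi2_alt = sum(wi * (xi - a_ft) ** 2 for wi, xi in zip(w, x_ft)) / sigma_ft ** 2

print(chi2_feet, chi2_inches, chi2_alt)
```

The residuals in inches are twelve times those in feet, but each is divided by a standard error that is also twelve times as large, so the standardized residuals are unchanged.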

Exercise 2. When both x and y observations are subject to
error, one may wish to designate the summation explicitly as

$$\chi^2 = \frac{1}{\sigma^2} \sum \left( w_x V_x^2 + w_y V_y^2 \right)$$

as has already been indicated in Section 8. Show that this may be
written

$$\chi^2 = \sum \left[ \left( \frac{V_x}{\sigma/\sqrt{w_x}} \right)^2 + \left( \frac{V_y}{\sigma/\sqrt{w_y}} \right)^2 \right]$$

which again says that $\chi^2$ is the sum of the squares of all the
residuals, each one being measured in units of the standard error of the
corresponding observation on the x or y coordinate. So $\chi^2$ is,
as before, the sum of the squares of the standardized residuals.
The remarks in the preceding exercise still hold.

Exercise 3. S, or the sum of the weighted squares of the
residuals, like $\chi^2$, is also invariant to changes in units (as from pounds
to ounces, etc.). But S is dependent on the arbitrary choice of $\sigma$,
whereas $\chi^2$ is not. One weight in the whole set is arbitrary, and
the others are related to it through Eq. 13; fixing this one weight
is equivalent to fixing $\sigma$. S can be doubled by doubling all the
weights, but this has no effect on $\chi^2$ because $\sigma^2$ would also be
doubled. The least squares solution for a (and other parameters,
if any, as in more complicated problems) is independent of $\sigma^2$; the
parameter or parameters that minimize S for one set of weights
will also minimize it if all the weights are doubled.

For another interpretation of S in curve fitting, see Exercise 3 of
Section 68, page 145, where S is seen to be equal to the sum of $W F_0^2$.
Other exercises dealing with weights occur at the end of Chapter III.
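The effect of doubling all the weights, described in Exercise 3, can be confirmed directly. The sketch assumes invented data and an invented value of sigma squared; the helper name fit is ours.

```python
# Invented observations, weights, and unit-weight variance, for illustration only.
x = [10.2, 9.8, 10.5, 10.1]
w = [1.0, 2.0, 0.5, 4.0]
sigma2 = 0.04  # assumed variance of an observation of unit weight


def fit(obs, weights):
    """Return the least squares value of a (weighted mean) and the minimized S."""
    a = sum(wi * xi for wi, xi in zip(weights, obs)) / sum(weights)
    S = sum(wi * (xi - a) ** 2 for wi, xi in zip(weights, obs))
    return a, S


a1, S1 = fit(x, w)
a2, S2 = fit(x, [2.0 * wi for wi in w])   # every weight doubled

# Doubling the weights goes with doubling sigma^2 (Eq. 13), so chi-square is unchanged.
chi2_1 = S1 / sigma2
chi2_2 = S2 / (2.0 * sigma2)

print(a1, a2, S1, S2, chi2_1, chi2_2)
```

The adjusted value and chi-square agree between the two runs, while S itself is twice as large in the second.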

12. A more complicated problem: several samples. (a) All
observations have the same precision. Let us suppose that n
observations of equal weight (equal precision), and all on the same
unknown, as for example those of Section 9, are arbitrarily
subdivided into m samples of $n_1, n_2, \ldots, n_m$ observations. We shall
say that

$X_1$ is the mean of $n_1$ single observations,
$X_2$ is the mean of $n_2$ single observations,
. . .
$X_m$ is the mean of $n_m$ single observations.

Now if single observations have unit weight, then it will follow
from Eqs. 14 or 15 that the weights of the m means are

$$w_1 = n_1, \quad w_2 = n_2, \quad \ldots, \quad w_m = n_m$$

We may now consider the m sample means to be m observations of
weights respectively $n_1, \ldots, n_m$, to which the results of Section 10
apply. The value of a that minimizes $\chi^2$ is then

$$X = a = \frac{n_1 X_1 + n_2 X_2 + \cdots + n_m X_m}{n_1 + n_2 + \cdots + n_m} \tag{17}$$

which follows from Eq. 10. This value of X is the weighted mean of
the m samples. Actually, it is also the mean of the entire group
of $n_1 + n_2 + \cdots + n_m$ single observations, since they are all of
the same weight (unity). Our result implies that when the
residuals ($V_i$) are reckoned from this value of X, the sum S or
$\sum wV^2$ is a minimum. By Eq. 11, page 19, its value is

$$S = \sum n_i (X_i - X)^2 = \sum n_i X_i^2 - nX^2$$

A schematic representation of the observations, residuals, and
errors, and their relationships to the weighted mean, is shown in
Fig. 6.

Exercise 1. Show that when the residuals ($V_i$) are measured
from the value of X shown in Eq. 17, the weighted sum of the
residuals is zero. That is, $\sum wV = 0$.

It is not to be inferred that $\sum wV = 0$ in all least squares
adjustments; see Remark 4 on page 182.

Fig. 6. Three series of observations on a magnitude $\mu$.

$n_1$ observations have mean $X_1$ and standard deviation $s_1$
$n_2$ observations have mean $X_2$ and standard deviation $s_2$
$n_3$ observations have mean $X_3$ and standard deviation $s_3$

X is the weighted mean of the three series. The errors and residuals in the
individual means $X_1, X_2, X_3$ are denoted by $E_1, E_2, E_3$ and $V_1, V_2, V_3$
respectively. The error in the weighted mean X is denoted by E. As the figure
happens to be drawn, $E$, $E_1$, $E_3$, $V_1$, and $V_3$ are positive, and $E_2$ and $V_2$
are negative, as the arrows indicate. This case of curve fitting is intermediate
between the simplest problem shown on page 16 and more difficult ones
described in Chapter VIII.

Exercise 2. The value of X is independent of the mode of
dividing up the n observations, that is, the subgroups of
$n_1, n_2, \ldots$, and $n_m$ observations can be formed from the n observations
in any manner whatever.

Exercise 3. On the contrary, the value of S does depend on how
the n observations are subdivided, and similarly for $\chi^2$. (Note:
$\chi^2$ is just S divided by $\sigma^2$.)
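Exercises 2 and 3 can be illustrated with one set of observations divided up in two different ways. The sketch below uses ten invented observations of unit weight; the function name weighted_fit and the two particular subdivisions are ours.

```python
# Ten invented observations of equal (unit) weight, for illustration only.
obs = [31.2, 32.5, 30.8, 33.1, 32.0, 31.7, 32.9, 31.4, 32.2, 32.6]


def weighted_fit(groups):
    """Treat each group mean as one observation of weight n_i (Eq. 17)."""
    n = [len(g) for g in groups]
    means = [sum(g) / len(g) for g in groups]
    X = sum(ni * Xi for ni, Xi in zip(n, means)) / sum(n)
    S = sum(ni * (Xi - X) ** 2 for ni, Xi in zip(n, means))
    return X, S


# Two arbitrary ways of subdividing the same ten observations.
split_a = [obs[:3], obs[3:7], obs[7:]]   # samples of 3, 4, and 3
split_b = [obs[:5], obs[5:]]             # samples of 5 and 5

X_a, S_a = weighted_fit(split_a)
X_b, S_b = weighted_fit(split_b)

print("X:", X_a, X_b)   # the same weighted mean either way
print("S:", S_a, S_b)   # different sums of weighted squares
```

The weighted mean is the grand mean of the ten observations in both cases, whereas S measures only the scatter of the sample means about it and therefore changes with the grouping.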

(b) The precisions of the single observations differ from one
sample to another. Suppose that

$X_1$ is the mean of $n_1$ observations from a population of
standard deviation $\sigma_1$. Then the variance of $X_1$ will
be $\sigma_1^2/n_1$.

. . .

$X_m$ is the mean of $n_m$ observations from a population of
standard deviation $\sigma_m$. Then the variance of $X_m$ is
$\sigma_m^2/n_m$.

$\sigma_1, \sigma_2, \ldots, \sigma_m$ need not all be equal. For the weights of $X_1$, $X_2$,
etc., we may take

$$w_1 = \frac{\sigma^2}{\sigma_1^2/n_1} = \frac{n_1 \sigma^2}{\sigma_1^2}, \quad w_2 = \frac{\sigma^2}{\sigma_2^2/n_2} = \frac{n_2 \sigma^2}{\sigma_2^2}, \quad \ldots, \quad w_m = \frac{\sigma^2}{\sigma_m^2/n_m} = \frac{n_m \sigma^2}{\sigma_m^2} \tag{18}$$

$\sigma^2$ is arbitrary, because the weights are purely relative. Again,
as in Sections 9 and 10, the problem is to fit the curve

$$x = a \tag{4}$$

The answer is already contained in Eq. 10 on p. 19, which applied
to the present problem gives

$$a = \frac{\sum wX}{\sum w} = \frac{\dfrac{n_1 X_1}{\sigma_1^2} + \dfrac{n_2 X_2}{\sigma_2^2} + \cdots + \dfrac{n_m X_m}{\sigma_m^2}}{\dfrac{n_1}{\sigma_1^2} + \dfrac{n_2}{\sigma_2^2} + \cdots + \dfrac{n_m}{\sigma_m^2}} = X \tag{19}$$


The quantity X just defined is the weighted mean of the m
samples. Residuals ($V_i$) reckoned from it make S or $\sum w_i (X_i - X)^2$
a minimum.

The problem of part (b) reduces to that of part (a) if
$\sigma_1 = \sigma_2 = \cdots = \sigma_m$. It is interesting to see that $\sigma$ does not appear
in the fraction of Eq. 19, i.e., X is independent of $\sigma$. If $\sigma^2$ were
doubled, all weights would be doubled, but X would be
unaltered. Likewise $\chi^2$ in Eq. 20 would be unaltered. (See the
exercises in Sec. 11.)

Exercise 4. Show that when the residuals are measured from X
as defined in Eq. 19,

$$\sum wV = 0$$

as was true also in part (a) of this section. (See Exercise 1, p. 24.)
Note that $\chi^2$ can be written

$$\chi^2 = \sum \frac{(\text{Discrepancy between } X_i \text{ and } X)^2}{\text{Variance of } X_i \text{ about true mean}} \tag{20}$$

See Exercise 1 of Section 11, page 22.
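Equations 18 to 20 lend themselves to a brief numerical check. The following sketch rests on invented figures: three sample sizes, three sample means, and three population standard deviations assumed known. The function name adjust is ours.

```python
# Invented summary figures for three samples, for illustration only.
n = [5, 8, 4]                  # sample sizes n_i
X = [10.12, 10.05, 10.20]      # sample means X_i
sig = [0.10, 0.05, 0.20]       # population standard deviations sigma_i, assumed known


def adjust(sigma2):
    """Weighted mean, S, and chi-square for a given arbitrary sigma^2."""
    w = [ni * sigma2 / si ** 2 for ni, si in zip(n, sig)]        # Eq. 18
    X_w = sum(wi * Xi for wi, Xi in zip(w, X)) / sum(w)          # Eq. 19
    S = sum(wi * (Xi - X_w) ** 2 for wi, Xi in zip(w, X))
    return X_w, S, S / sigma2                                    # chi^2 = S / sigma^2


X_1, S_1, chi2_1 = adjust(1.0)
X_2, S_2, chi2_2 = adjust(2.0)   # sigma^2 doubled, hence all weights doubled

# Eq. 20: squared discrepancy over the variance of each sample mean.
chi2_eq20 = sum((Xi - X_1) ** 2 / (si ** 2 / ni) for Xi, si, ni in zip(X, sig, n))

# Exercise 4: the weighted residuals sum to zero.
weighted_residual_sum = sum(ni / si ** 2 * (Xi - X_1) for ni, si, Xi in zip(n, sig, X))

print(X_1, X_2, S_1, S_2, chi2_1, chi2_2, chi2_eq20, weighted_residual_sum)
```

Changing the arbitrary factor alters S in proportion but leaves both the weighted mean and chi-square as they were, which is the content of the remark following Eq. 19.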

13. The estimates of $\sigma$, internal and external. Because of the
distribution 5 of $\chi^2$ when the actual sampling (the experimental
work) is described by the mathematical model here assumed,
namely, normally distributed observations, the mean value of
$\chi^2$ in the long run is equal to k, the number of independent residuals
or “degrees of freedom.” 6 In any one experiment, $\chi^2$ may be
larger or smaller than the average. For the problems of parts (a)
and (b), the number is $m - 1$ because there is one relation (Eqs. 17
or 19) between the m residuals and X. The unbiased 7 estimate of
$\sigma^2$ made by external consistency 8 is found by calculating what value
of $\sigma$ forces $\chi^2$ to take its mean value k. In other words, the
estimated $\sigma$ satisfies Eq. 3, page 15, whence comes the estimate

$$\sigma^2(ext) = \frac{S}{k} \tag{21}$$

5 Karl Pearson, Phil. Mag., vol. 50, 1900: pp. 157-175. A paper dealing
more specifically with curve fitting of the kind here considered will be found in
the J. Amer. Stat. Assoc., vol. 29, 1934: pp. 372-382; see also Phil. Mag.,
vol. 19, 1935: pp. 389-402.

6 The correction for the number of unknowns evaluated (one in this case)
and the equivalent of setting the mean value of $\chi^2$ equal to S divided by the
number of observed quantities diminished by the number of unknowns
evaluated were set forth by Gauss in his Theoria Combinationis Observationum
Erroribus Minimis Obnoxiae, Pars posterior (Gottingen, 1823; vol. 4 of his
Werke), Art. 38. This correction is sometimes credited to Bessel, but the
reference just given, which was kindly furnished by Dr. G. J. Lidstone, places
the originality with Gauss.

7 Unbiased in the sense that its mean value is $\sigma^2$.

8 The terms external and internal consistency were introduced by Birge
(Phys. Rev., vol. 40, 1932: pp. 207-227). The comparison of the two estimates
(Sec. 13) is an application of the “analysis of variance,” the essential features
of which have long been recognized by physical scientists; see, for example,
A. de Forest Palmer, Theory of Measurements (McGraw-Hill, 1912) pp. 66-71.


From this equation, for the problem of Section 12b, we get

$$\sigma^2(ext) = \frac{S}{k} = \frac{1}{m - 1} \sum \frac{n_i \sigma^2}{\sigma_i^2} (X_i - X)^2 \tag{22}$$

This estimate is made from the external consistency of the data,
i.e., from the fit of the “curve” X = a. What we do in making
the estimate $\sigma(ext)$ is to say arbitrarily that $\chi^2$ does equal k.
This is equivalent to saying that $P(\chi)$ is about ½, though not exactly ½
because of the skewness of the $\chi^2$ distribution, which, however,
gradually disappears with increasing k.

If we are not positive that all m samples came from populations
having coincident means, we should have as an alternate hypothesis
that the m population means $\mu_1, \mu_2, \ldots, \mu_m$ are not all identical.
Now if one or more of them really are not equal to the others,
$\sigma^2(ext)$ is raised, on the average, to some value higher than $\sigma^2$;
consequently in examining the hypothesis that $\mu_1 = \mu_2 = \cdots = \mu_m$,
we should be interested in knowing if $\sigma^2(ext)$ is significantly greater
than $\sigma^2$, or, what is the same thing, if $\chi^2$ is significantly higher
than k. This can be ascertained by looking up $P(\chi)$ in tables of
chi-square. Of course, $\chi^2$ can not be computed or compared with
k unless $\sigma$ is known. Or, to use Fisher’s table of z, one would set

$$z = \tfrac{1}{2} \ln \frac{\sigma^2(ext)}{\sigma^2} \tag{23}$$

and look up P(z) with Fisher’s $n_1$ as $m - 1$, and with $n_2$ equal to
infinity, since $\sigma^2$ is here assumed known. In regard to the
interpretation of P(z), see the small type in the next section. Tables
and examples in the use of $P(\chi)$ and P(z) will be found in Fisher’s
Statistical Methods for Research Workers (Oliver and Boyd); also
in several other texts.

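Now the

$$ \text{Wt. of } \bar{X} = w_x = \sum \frac{\sigma^2}{\sigma_i^2} \tag{24} $$

and the

$$ (\text{Est'd S.E. of } \bar{X})_{\mathrm{ext}}^2 = \frac{\sigma^2(\mathrm{ext})}{w_x} = \frac{1}{(m-1)\,w_x} \sum w \cdot \mathrm{res}^2 = \frac{1}{(m-1)\,w_x}\, S \tag{25} $$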


There is also the estimate of σ made from the internal consistency⁹
of the data, i.e., from the consistency of the observations within
samples. This is¹⁰

$$ \sigma^2(\mathrm{int}) = \frac{n_1 s_1^2 + n_2 s_2^2 + \cdots + n_m s_m^2}{n_1 + n_2 + \cdots + n_m - m} \tag{26} $$

whence the

$$ (\text{Est'd S.E. of } \bar{X})_{\mathrm{int}}^2 = \frac{\sigma^2(\mathrm{int})}{w_x} \tag{27} $$

wherein w_x has the value given in Eq. 24. s₁, s₂, …, sₘ are the
standard deviations of the m samples.

The estimate by internal consistency is possible only if there
are points in which there is more than one observation. When
there is but one observation at each point, the estimate of σ by
internal consistency is not a possibility.

14. Comparison of the two estimates — analysis of variance. 

As was mentioned in the preceding section, the estimate σ(ext) is
valid only if the m populations have coincident means; if any two
of the means μ₁, μ₂, …, μₘ are unequal, σ²(ext) is, on the average,
raised above σ². But, in contrast, the estimate σ(int) is unaffected
by inequalities among the means of the populations; so long as σ
remains the constant standard deviation of all of them, the average
value of σ²(int) is still σ². It follows that a statistical test of the
hypothesis μ₁ = μ₂ = … = μₘ is to examine the ratio of the two

9 See the reference to Birge on page 27. 

10 See, for example, Eq. 67 in Deming and Birge's Statistical Theory of 
Errors (The Graduate School, The Department of Agriculture, Washington, 
1934, 1938), p. 158. 





estimates. To do this, we may follow Fisher and take

$$ z = \tfrac{1}{2} \ln \frac{\sigma^2(\mathrm{ext})}{\sigma^2(\mathrm{int})} \tag{28} $$

and look in his tables to see if z is “significantly” different from 0.
(In doing this, we use m − 1 for Fisher’s n₁, and n₁ + n₂ + … + nₘ − m
for his n₂.) If z is found to be so large that it lies beyond
the 1 percent limit, we say there is “statistical evidence” that
the data are not homogeneous, or that not all the μ are equal;
in other words, that the curve

$$ X_i = a \tag{29} $$

is not a good fit.
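The comparison of the two estimates can be sketched numerically. The sketch below assumes that all m samples share a common σ, so that the weight of each sample mean is simply its size nᵢ, and it reads sᵢ² in Eq. 26 as the mean squared deviation within sample i (divisor nᵢ), so that nᵢsᵢ² is the within-sample sum of squares. The function name and the data are illustrative only.

```python
import math

def variance_estimates(samples):
    """Return (sigma2_ext, sigma2_int, z) for a list of samples.

    Assumes a common sigma, so each sample mean X_i has weight n_i.
    """
    m = len(samples)
    sizes = [len(s) for s in samples]
    means = [sum(s) / len(s) for s in samples]
    total = sum(sizes)
    # Weighted grand mean of the sample means (weights n_i).
    grand = sum(n * x for n, x in zip(sizes, means)) / total
    # External consistency: scatter of the sample means about the grand mean.
    sigma2_ext = sum(n * (x - grand) ** 2 for n, x in zip(sizes, means)) / (m - 1)
    # Internal consistency: n_i * s_i^2 is the sum of squared deviations in sample i.
    within = sum((v - x) ** 2 for s, x in zip(samples, means) for v in s)
    sigma2_int = within / (total - m)
    # Fisher's z of Eq. 28.
    z = 0.5 * math.log(sigma2_ext / sigma2_int)
    return sigma2_ext, sigma2_int, z

# Hypothetical data: three samples of three observations each.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
ext, internal, z = variance_estimates(samples)
print(ext, internal, z)
```

The resulting z would be referred to Fisher's table with n₁ = m − 1 and n₂ = n₁ + n₂ + … + nₘ − m degrees of freedom; the table lookup itself is not attempted here.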

Remark. Such a calculation of “significance” takes account
only of the numerical data of this one experiment. An estimate
of σ is not to be regarded as a number that can be used in place
of σ unless the observations have demonstrated randomness
(Ch. I), and not unless the number of degrees of freedom (the
denominator in Eqs. 21 or 26) amounts to 15 or 20, and preferably
more. A broad background of experience is necessary
before one can say whether his experiment is carried out by
demonstrably random methods. Moreover, even in the state
of randomness, it must be borne in mind that unless the number
of degrees of freedom is very large, a new experiment will give
new values of both σ(ext) and σ(int), also of P(χ) and P(z).
Ordinarily, there will be a series of experiments, and a corresponding
series of P values. It is the consistency of the P
values of the series, under a wide variety of conditions, and not
the smallness of any one P value by itself that determines a
basis for action, particularly when we are dealing with a cause
system underlying a scientific law (Ch. I). In the absence of a
large number of experiments, related knowledge of the subject
and scientific judgment must be relied on to a great extent in
framing a course of action. Statistical “significance” by
itself is not a rational basis for action.

15. Another simple problem: the slope of a line that is known
to pass through the origin. (a) The y coordinates subject to error;
x free of error. The equation to be fitted to the points in Fig. 7 is

$$ y = bx \tag{30} $$





Let yᵢ denote the observed ordinate at the ith point; then yᵢ − bxᵢ
is the residual at that point. It is a vertical, or y residual, because
the error is all in y, by assumption. If wᵢ denotes the weight of
yᵢ, then the sum

$$ S = \sum w_i (y_i - b x_i)^2 \tag{31} $$

is to be minimized. We differentiate this with respect to b
and obtain

$$ \frac{dS}{db} = -2 \sum w_i x_i (y_i - b x_i) \tag{32} $$

Set equal to zero, this gives

$$ b \sum w x^2 = \sum w x y \tag{33} $$

whence

$$ b = \frac{\sum w x y}{\sum w x^2} \tag{34} $$

[Fig. 7. ... the origin. The slope is to be estimated from the observed points.]

The subscripts are omitted for convenience. w means the weight
of a y observation, as before.

Note that here, neither Σ res nor Σ w · res is necessarily zero, but
that Σ w · x · res = 0. The student should demonstrate these
statements. (Cf. Remark 4, p. 182, for an extension of this note.)
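The note can be checked numerically. The following is a minimal sketch of Eq. 34, assuming errors in y only; the function name, the points, and the weights are hypothetical.

```python
def slope_through_origin(x, y, w):
    """Least squares slope b of y = b*x, errors in y only (Eq. 34)."""
    sum_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    sum_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    return sum_wxy / sum_wxx

# Hypothetical observations and weights.
x = [1.0, 2.0, 4.0]
y = [1.1, 1.9, 4.2]
w = [1.0, 2.0, 1.0]

b = slope_through_origin(x, y, w)
res = [yi - b * xi for xi, yi in zip(x, y)]

sum_res = sum(res)                                        # not zero in general
sum_w_res = sum(wi * ri for wi, ri in zip(w, res))        # not zero in general
sum_wx_res = sum(wi * xi * ri for wi, xi, ri in zip(w, x, res))  # zero by Eq. 33
print(b, sum_res, sum_w_res, sum_wx_res)
```

Only the third sum vanishes (up to rounding), because it is exactly the normal equation, Eq. 33, rearranged.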


Special cases. i. Suppose that the weight of y is inversely
proportional to x; i.e., the square of the standard error of y is
proportional to x. Then Eq. 34 gives

$$ b = \frac{\sum y}{\sum x} \tag{35} $$

Here, Σ res = 0, as the student should prove.

The result obtained in Eq. 35 has application in many problems
in the social sciences. Sample surveys of (e.g.) vacancy are
often taken in a city or metropolitan district, by picking out
certain blocks, or segments of blocks, and noting at every
dwelling unit therein (or sometimes at every kth dwelling unit)
whether that dwelling unit is vacant or occupied. If in Block i,
or Segment i, it is found that there are xᵢ dwelling units, of
which yᵢ are vacant, then when the survey is completed, the





estimated vacancy rate (fraction vacant) for the entire city 
may be taken as 


$$ b = \frac{\sum y_i}{\sum x_i} = \frac{\text{Total number of vacant dwelling units in the sample}}{\text{Total number of dwelling units in the sample}} \tag{35'} $$


This estimate will be close enough for purposes of action, if the 
sample is not too small. Often a 5 or 10 percent sample of all 
the dwelling units in a city or metropolitan district is sufficient. 

The justification for using Eq. 35 to obtain an estimate of the
vacancy rate lies in the observation that, except when the
vacancy rate is inordinately high, the vacant dwelling units are
usually scattered throughout the city at random. (This observation
was first made by Messrs. J. Stevens Stock and Lester R. Frankel
of Washington, in their sample surveys of rent
and housing.) 11

Another application of Eq. 35 is to the hatchability of eggs: 
the more eggs set, the more hatch y (except for random scattered 
infertility), but also the greater the error in y, in absolute 
numbers. 

Eq. 35 is used in Example IV at the end of the book, for a 
sample inventory of canned goods. 


ii. Suppose that all the y weights are equal; i.e., all y observations
have the same standard error. Then Eq. 34 gives

$$ b = \frac{\sum x y}{\sum x^2} \tag{36} $$

This is perhaps a more usual case than the preceding one,
particularly in engineering, physics, and chemistry.

iii. Suppose that the weight of an observation on y is inversely
proportional to x². By putting w = 1/x², we find from Eq. 34
that

$$ b = \frac{\sum (y/x)}{\sum 1} = \frac{1}{n} \sum \frac{y}{x} = \text{average } \frac{y}{x} \tag{37} $$

The letter n here stands for the number of points. Each observed
point gives an observed slope y/x, and the least squares
estimate obtained from all the points is in this case simply the
average of all n observed slopes.

11 Private communication to the author. 





The distinction between Eqs. 35, 36, and 37 should be noted 
carefully. In Eq. 35 a point has more influence on b if it is far 
outlying, this influence being closely proportional to the distance 
of the point from the origin. In Eq. 36 the influence of a point 
is further accentuated by its distance from the origin. In Eq. 37 
the advantage of distance is completely removed, the final result 
being merely the average slope of the n rays joining the origin 
with the observed points. 
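The three special cases can be compared side by side. This is a small sketch with made-up points (all with x > 0); `weighted_slope` is a hypothetical helper implementing Eq. 34, against which the closed forms of Eqs. 35, 36, and 37 are checked.

```python
def weighted_slope(x, y, w):
    """General least squares slope of y = b*x (Eq. 34)."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den

# Hypothetical points, all with x > 0.
x = [1.0, 2.0, 4.0]
y = [1.2, 1.8, 4.6]
n = len(x)

b35 = sum(y) / sum(x)                                # Eq. 35: w proportional to 1/x
b36 = (sum(xi * yi for xi, yi in zip(x, y))
       / sum(xi * xi for xi in x))                   # Eq. 36: equal weights
b37 = sum(yi / xi for xi, yi in zip(x, y)) / n       # Eq. 37: w proportional to 1/x^2

# Each closed form agrees with Eq. 34 under the corresponding weights.
g35 = weighted_slope(x, y, [1.0 / xi for xi in x])
g36 = weighted_slope(x, y, [1.0] * n)
g37 = weighted_slope(x, y, [1.0 / xi ** 2 for xi in x])
print(b35, b36, b37)
```

The three slopes differ because the outlying point at x = 4 is given progressively less advantage in passing from Eq. 36 to Eq. 35 to Eq. 37.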

(b) The tabular solution of b, and its weight, and the sum S.
This will be similar to the tabulation in Section 10b (q.v.). We
enter in Row I, from Eq. 33, the coefficient of b under b, and the
right-hand member under the “1.” Enter the weighted sum of
squares of y in Row 2. Fill in the C column with 1 and 0 as shown.

Row    b       = 1                       C      How obtained (cf. Sec. 10b)
I      Σwx²    Σwxy                      1
2      ···     Σwy²                      0
3      ···     −(Σwxy)²/Σwx²             ···    By multiplying Row I through by −Σwxy/Σwx²
II     ···     Σwy² − (Σwxy)²/Σwx²       ···    By adding Rows 2 and 3

An ellipsis (···) in the tabular array denotes a space wherein a
number would ordinarily be entered in numerical calculation, but in
which it is not worth while to show the entry in symbols.

Row I solved with the “1” column gives b as in Eq. 34.

Row I solved with the C column gives b = 1/Σwx², which means
that

$$ \text{The weight of } b = w_b = \sum w x^2 \tag{38} $$

Row II shows the minimized value of S in the “1” column,
which is to say that

$$ S \text{ or } \sum w (y - bx)^2 = \sum w y^2 - \frac{\left(\sum w x y\right)^2}{\sum w x^2} \tag{39} $$




Thus S is calculated in the tabular solution without the necessity
of first solving for b and the individual residuals. The initial sum
of squares, Σwy² in Row 2, is seen to be reduced by the amount
(Σwxy)²/Σwx², the residuals being finally measured from the
fitted line, instead of the x axis.

Here the external estimate of σ will be found by writing

$$ \sigma^2(\mathrm{ext}) = \frac{S}{m-1} \quad \text{(Cf. Sec. 13.)} \tag{40} $$

whence the

$$ (\text{Est'd S.E. of } b)_{\mathrm{ext}}^2 = \frac{S}{(m-1) \sum w x^2} \tag{41} $$
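As a sketch of how Eqs. 34 and 38 to 41 fit together, the function below computes S from the three sums alone and then checks it against the sum of weighted squared residuals computed directly. The function name and the data are hypothetical, and m is taken as the number of points.

```python
import math

def fit_through_origin(x, y, w):
    """Slope, minimized S, external variance estimate, and S.E. of b."""
    sum_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sum_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    sum_wyy = sum(wi * yi * yi for wi, yi in zip(w, y))
    b = sum_wxy / sum_wxx                      # Eq. 34
    s_min = sum_wyy - sum_wxy ** 2 / sum_wxx   # Eq. 39, no residuals needed
    m = len(x)
    sigma2_ext = s_min / (m - 1)               # Eq. 40
    se_b = math.sqrt(sigma2_ext / sum_wxx)     # Eq. 41, weight of b from Eq. 38
    return b, s_min, sigma2_ext, se_b

# Hypothetical observations and weights.
x = [1.0, 2.0, 4.0]
y = [1.1, 1.9, 4.2]
w = [1.0, 2.0, 1.0]

b, s_min, sigma2_ext, se_b = fit_through_origin(x, y, w)

# Direct computation of S from the individual residuals, for comparison.
s_direct = sum(wi * (yi - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
print(b, s_min, s_direct, sigma2_ext, se_b)
```

In hand calculation the shortcut of Eq. 39 saves labor; in floating-point arithmetic it subtracts two nearly equal numbers, so the direct sum of squared residuals is a useful cross-check.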


16. The t test for the slope. In order to apply the Student t test
to see if there is statistical evidence that the calculated value of
b is “significantly different” from some theoretical value, say B,
we should write

$$ t = \frac{|B - b|}{\text{Est'd S.E. of } b} \tag{42} $$

and make the Student t test, using Fisher's n equal to our m − 1.
The region of rejection in the t distribution is to be chosen with
due regard to admissible alternative slopes, which may be greater
or less than B. In the denominator of Eq. 42 we may use the
estimate made by external consistency, or that made by internal
consistency (Sec. 13). If σ(int) were used in place of σ(ext) in
Eq. 41, then we should have

$$ (\text{Est'd S.E. of } b)_{\mathrm{int}}^2 = \frac{\sigma^2(\mathrm{int})}{w_b} = \frac{\sigma^2(\mathrm{int})}{\sum w x^2} \tag{43} $$

This would replace the denominator of Eq. 42, and the number of
degrees of freedom (Fisher’s n) would be the total number of
observations diminished by the number m.

With regard to the interpretation of statistical tests, see the 
remark at the end of Section 14, page 30. 



17. The x coordinates subject to error, y free of error. wᵢ will
now denote the weight of xᵢ. In place of Eq. 31 we now have

$$ S = \sum w_i \left( x_i - \frac{y_i}{b} \right)^2 \tag{44} $$

since here the y residuals are zero, and S is made up by squaring
the x residuals. By differentiation

$$ \frac{dS}{db} = +\frac{2}{b^2} \sum w_i y_i \left( x_i - \frac{y_i}{b} \right) \tag{45} $$

Set equal to zero this gives

$$ b \sum w x y = \sum w y^2, \quad \text{or} \quad b = \frac{\sum w y^2}{\sum w x y} \tag{46} $$

Note the distinction between Eqs. 34 and 46. w in Eq. 46 is the 
weight of x , not y. 

For another derivation of Eq. 46, see Hint 1 in Exercise 12, 
following. 
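A short numerical sketch of Eq. 46 follows, assuming errors in x only. The data and function names are hypothetical; the check simply confirms that the slope from Eq. 46 gives a smaller S (Eq. 44) than slightly larger or smaller slopes, and that it differs from the slope Eq. 34 would give for the same points with equal weights.

```python
def slope_x_in_error(x, y, w):
    """Slope of y = b*x when x carries the error and y is exact (Eq. 46)."""
    sum_wyy = sum(wi * yi * yi for wi, yi in zip(w, y))
    sum_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    return sum_wyy / sum_wxy

def s_of_b(b, x, y, w):
    """Weighted sum of squared x residuals for a trial slope b (Eq. 44)."""
    return sum(wi * (xi - yi / b) ** 2 for wi, xi, yi in zip(w, x, y))

# Hypothetical observations: x observed with error, y exact, equal weights.
x = [1.05, 1.90, 4.10]
y = [1.0, 2.0, 4.0]
w = [1.0, 1.0, 1.0]

b_x = slope_x_in_error(x, y, w)

# Slope that Eq. 34 would give for the same points (error attributed to y).
b_y = (sum(xi * yi for xi, yi in zip(x, y))
       / sum(xi * xi for xi in x))

s_best = s_of_b(b_x, x, y, w)
s_above = s_of_b(b_x * 1.01, x, y, w)
s_below = s_of_b(b_x * 0.99, x, y, w)
print(b_x, b_y, s_best, s_above, s_below)
```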


Exercises 

Exercise 1. If b in Eq. 46 be distinguished as b′, prove that
between Eqs. 34 and 46 there exists the relation

$$ \frac{b}{b'} = \frac{\sum w_y x y \, \sum w_x x y}{\sum w_y x^2 \, \sum w_x y^2} $$

Exercise 2. Find the curve y = bx, also w_b, from the following
data.

x      y (obs)    w_y
1      0.52       1
2      0.96       1
3      1.50       2
5      2.65       1

y alone is subject to error. Use the tabular arrangement on
page 33.





Exercise 3. Find the curve y = b′x, also w_b′.

x (obs)    y      w_x
1.05       0.5    1
1.90       1.0    1
3.08       1.5    2
4.86       2.5    1

x alone is subject to error. Again use the tabular arrangement,
but be careful.



Part B 


THE LEAST SQUARES SOLUTION OF 
MORE COMPLICATED PROBLEMS 

CHAPTER III 

THE PROPAGATION OF ERROR 

18. Small errors in functions of one variable. If f(x) is a function
of x, the linear term in Taylor’s series can often be used to
express with sufficient accuracy the effect on f(x) of a small error
in x. Thus, if Δx is the error in x, and Δf the resulting error in
f(x), Δf and Δx may be closely enough related by the equation

$$ \Delta f = f'(x)\, \Delta x \tag{1} $$

This is the equation for the propagation of error in a function of a
single variable. f′(x) is the first derivative of f(x), evaluated at
the point x, f(x). In practice, the true value of x is not known,
but it is usually sufficient to evaluate f′(x) at a near-by point, such
as a point whose coordinates are determined experimentally.
f′(x) remains constant while Δx and Δf vary.

The above equation says that the error in f(x) will be proportional
to the error in x. The derivative f′(x) is the factor of
proportionality. The equation is not exact, i.e., the error in f is
not strictly proportional to the error in x, except when f(x) is a
linear function of x. It is close enough for actual use provided
the error Δx is small enough, or when the higher derivatives of
f(x) are small enough. Fortunately, in practice much experimental
work, and most of the functions used, satisfy these requirements.

The relation between the error in f(x), and the approximation
afforded by Eq. 1, is shown in Fig. 8.



In a linear function, such as

$$ f(x) = a + bx \tag{2} $$

the higher derivatives f″(x), f‴(x), etc., are zero absolutely, and
Eq. 1 reduces to

$$ \Delta f = b\, \Delta x \tag{3} $$

which is exact for any error in x, however large. The error in
f(x) will now be exactly proportional to the error in x.

Fig. 8. Showing the relation between the errors in x and f(x), and the
approximation contained in the equation

$$ \Delta f = f'(x)\, \Delta x $$

Δx is the error in x, Δf the error in the function f(x). The approximation in
Eq. 1 is made by using the tangent to the curve in place of the curve itself.
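A brief numerical sketch of Eqs. 1 and 3 may help. The cubic and the straight line used here are arbitrary illustrations, not functions taken from the text, and `linear_error` is a hypothetical helper name.

```python
def linear_error(fprime, x, dx):
    """First-order (tangent-line) estimate of the error in f, as in Eq. 1."""
    return fprime(x) * dx

def cube(x):
    return x ** 3

def cube_prime(x):
    return 3 * x ** 2

# A curved function: the tangent-line estimate is close but not exact.
x, dx = 2.0, 0.01
approx = linear_error(cube_prime, x, dx)
exact = cube(x + dx) - cube(x)

# A linear function: the tangent is the function itself, so Eq. 3 is exact
# even for a very large error in x.
def line(x):
    return 3.0 + 5.0 * x

big_dx = 100.0
line_approx = linear_error(lambda _x: 5.0, x, big_dx)
line_exact = line(x + big_dx) - line(x)
print(approx, exact, line_approx, line_exact)
```

For the cubic, the difference between the two values comes entirely from the neglected terms in (Δx)² and (Δx)³.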

19. Small errors in functions of several variables. Taylor's
series can be extended to obtain expressions for small errors in
functions of several variables. Thus, if F is a function of x, y, z,
and if they are in error by the amounts Δx, Δy, Δz, then F will be
in error by some amount ΔF, which can be expressed closely
enough as

$$ \Delta F = F_x\, \Delta x + F_y\, \Delta y + F_z\, \Delta z \tag{4} $$

provided the errors Δx, Δy, and Δz are small enough, or when the
higher derivatives are small enough. Here F_x, F_y, F_z denote the
derivatives

$$ F_x = \frac{\partial F}{\partial x}, \quad F_y = \frac{\partial F}{\partial y}, \quad F_z = \frac{\partial F}{\partial z} \tag{5} $$

which in practice are to be evaluated at the point x, y, z, or as
near this point as is experimentally possible.

Eq. 4 is the formula for the propagation of error in three variables.
It can be extended to more variables simply by adding more terms
of the same kind.


Eq. 4 is written only through the first powers of Δx, Δy, and
Δz, because the rest of the Taylor series (involving the squares
and higher powers and cross-products of Δx, Δy, and Δz) will be
negligible if Δx, Δy, and Δz are not too large, or if the higher
derivatives are small enough. In practice, the possible errors
Δx, Δy, and Δz are limited in magnitude, and Eq. 4 is usually a
good enough approximation for ordinary situations. In the
event that F is linear in x, y, and z (as in Exercise 2 at the end
of the chapter), there are no terms at all except the linear terms
(i.e., there are no neglected terms), and Δx, Δy, and Δz may
then be ever so large without invalidating Eqs. 4, 7, 8, and 9.
(Compare with the explanation in the preceding section for a
function of a single variable, particularly the text accompanying
Eqs. 2 and 3.)


20. The propagation of mean square error or variance. Eq. 4
leads to a relation between the mean square errors or the variances
of x, y, z, and F, and hence also a relation between their standard
errors and their weights. If we square each side of Eq. 4 we get

$$ (\Delta F)^2 = (F_x \Delta x)^2 + (F_y \Delta y)^2 + (F_z \Delta z)^2 + 2 F_x F_y\, \Delta x\, \Delta y + 2 F_x F_z\, \Delta x\, \Delta z + 2 F_y F_z\, \Delta y\, \Delta z + \cdots \tag{6} $$

Now let Δx, Δy, and Δz take on all possible values¹ within their
allowable ranges of variation. The derivatives F_x, F_y, F_z, being
evaluated at x, y, z, are constants while Δx, Δy, and Δz vary.
Then let each term in Eq. 6 be replaced by its average value; the
result is

$$ \sigma_F^2 = (F_x \sigma_x)^2 + (F_y \sigma_y)^2 + (F_z \sigma_z)^2 + 2 (F_x F_y \sigma_x \sigma_y r_{xy} + F_x F_z \sigma_x \sigma_z r_{xz} + F_y F_z \sigma_y \sigma_z r_{yz}) \tag{7} $$

where σ_x² = variance of x, r_xy the correlation between Δx and Δy,
etc. This formula (also the simplified form in Eq. 8 when it
applies) is called the propagation of mean square error, or the
propagation of variance.

1 In practice it may safely be assumed that the ranges of variation in Δx,
Δy, and Δz are not large, wherefore the constancy of F_x, F_y, F_z is usually not
a difficulty. It is moreover presumed that the standard errors σ_x, σ_y, and σ_z
actually do exist, as they always do in practice. (There are theoretical and
attainable distributions of errors, in which the standard deviation is infinite.
An example is the Cauchy distribution y = 1/π(1 + x²), the like of which
must be excepted.)

The terms in parentheses are zero if the errors in x, y, and z are
independent, i.e., uncorrelated. In such a situation Eq. 7 reduces
to

$$ \sigma_F^2 = (F_x \sigma_x)^2 + (F_y \sigma_y)^2 + (F_z \sigma_z)^2 \tag{8} $$

or, by Eq. 13 of Chapter II,

$$ \frac{1}{w_F} = \frac{F_x F_x}{w_x} + \frac{F_y F_y}{w_y} + \frac{F_z F_z}{w_z} \tag{9} $$


This equation could be called the propagation of weight, if it 
needed a name. It will be seen to be of great importance in the 
solution of the general problem in least squares. One may refer to 
Exercises 11 and 12 at the end of this chapter; also Eq. 14 of Ch. IV, 
p. 55; Remark 1 of Sec. 28, p. 56; Eq. 8 of Ch. VIII, p. 134; Exer¬ 
cises 3 and 4 of Sec. 58, on page 145; and Remark 3 in Exercise 4 
of Ch. X, p. 181. 
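The propagation formulas lend themselves to a short sketch. The function below evaluates Eq. 7 for any number of variables and reduces to Eq. 8 when no correlations are supplied; its name, its argument layout (a square table of correlations indexed like the variables), and the numbers are assumptions made for illustration.

```python
def propagate_variance(derivs, sigmas, corr=None):
    """Variance of F from derivatives, standard errors, and correlations.

    derivs: values of F_x, F_y, ... at the point of interest
    sigmas: standard errors of x, y, ...
    corr:   optional square table of correlations r[i][j]; omitted means
            the errors are taken as independent (Eq. 8).
    """
    n = len(derivs)
    var = sum((d * s) ** 2 for d, s in zip(derivs, sigmas))
    if corr is not None:
        for i in range(n):
            for j in range(i + 1, n):
                var += (2 * derivs[i] * derivs[j]
                        * sigmas[i] * sigmas[j] * corr[i][j])
    return var

# F = x + y with independent errors of 3 and 4: variances add.
var_sum = propagate_variance([1.0, 1.0], [3.0, 4.0])

# F = x - y with equal standard errors and perfect correlation:
# the errors cancel and the variance of the difference is zero.
var_diff = propagate_variance([1.0, -1.0], [2.0, 2.0],
                              corr=[[1.0, 1.0], [1.0, 1.0]])
print(var_sum, var_diff)
```

Eq. 9 is the same computation expressed in weights: with the standard error of unit weight fixed, each σ² is replaced by the reciprocal of the corresponding weight.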


21. The standard error of a mean. It is interesting to see that
if F be taken as the mean (x̄) of the n independent observations x₁,
x₂, …, xₙ each of standard error σ, then Eq. 8 leads to the
well-known expression

$$ \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \quad \text{(Cf. Eq. 14 on p. 21.)} $$

as was taken for granted on page 21. This, however, does not
tell us that if the individual observations are normally distributed,
the mean x̄ is also; this fact must be obtained otherwise. Eqs. 7,
8, and 9 are in fact independent of any assumption concerning the
distributions of the errors in x, y, z, and F, provided the standard
errors σ_x, etc., actually exist, as was stipulated in the footnote on
page 39.
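The step from Eq. 8 to the standard error of a mean can be written out as follows. The only ingredient is the derivative of the mean with respect to each observation; the symbol F with subscript x_i is used here in the sense of Eq. 5, extended to n variables.

```latex
F = \bar{x} = \frac{1}{n}\,(x_1 + x_2 + \cdots + x_n),
\qquad
F_{x_i} = \frac{\partial \bar{x}}{\partial x_i} = \frac{1}{n}

\sigma_{\bar{x}}^2
  = \sum_{i=1}^{n} \left( F_{x_i}\, \sigma \right)^2
  = n \left( \frac{\sigma}{n} \right)^2
  = \frac{\sigma^2}{n}
```

Because the mean is linear in the observations, no terms are neglected, and the result holds for errors of any size.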

22. A numerical example of small errors. To see how the
Taylor series operates, we may try it with the particular function

$$ F = 4x^2 + \sin xy + z \tag{10} $$

where the product xy is in radians. Let us evaluate F at the point

x = 2
y = 0.95
z = 10

We find

$$ F_0 = 4 \times 2^2 + \sin 1.9 + 10 = 16 + 0.9463 + 10 = 26.9463 \tag{11} $$

Now let x decrease by the amount 0.1, y increase by 0.05, and z
increase by 0.2. These increments may be considered as small
errors, and we wish to see what effect they have on F. The new
values of x, y, and z are 1.9, 1.0, and 10.2, and the new value of F is

$$ F_1 = 4 \times 1.9^2 + \sin 1.9 + 10.2 = 25.5863 \tag{12} $$

The change in F is

$$ \Delta F = F_1 - F_0 = 25.5863 - 26.9463 = -1.3600 \tag{13} $$

To compare this (exact) value of ΔF with the approximation
afforded by Eq. 4, we first take the derivatives,

$$ \left. \begin{aligned} F_x &= 8x + y \cos xy \\ F_y &= x \cos xy \\ F_z &= 1 \end{aligned} \right\} \tag{14} $$

and then evaluate them at the point x = 2, y = 0.95, z = 10.
They turn out to be numerically

$$ \left. \begin{aligned} F_x &= 15.6929 \\ F_y &= -0.6466 \\ F_z &= 1 \end{aligned} \right\} \tag{15} $$

whence by Eq. 4, we calculate the approximation

$$ \Delta F = -15.6929 \times 0.1 - 0.6466 \times 0.05 + 1 \times 0.2 = -1.4016 \tag{16} $$

This is to be compared with the exact value of ΔF, computed in
Eq. 13. Other functions, and other values of Δx, Δy, Δz, would
give different degrees of approximation. In the development of
the general problem in least squares, we shall be compelled to
accept the approximations afforded by Eqs. 4, 7, 8, and 9.
Fortunately, for purposes of action, the results are usually close
enough.
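The arithmetic of this section is easily repeated by machine. The sketch below uses the function of Eq. 10 and the derivatives of Eq. 14; the Python names are of course not part of the original text.

```python
import math

def F(x, y, z):
    """The function of Eq. 10; the product x*y is in radians."""
    return 4 * x ** 2 + math.sin(x * y) + z

def gradient(x, y, z):
    """The derivatives F_x, F_y, F_z of Eq. 14."""
    c = math.cos(x * y)
    return (8 * x + y * c, x * c, 1.0)

x0, y0, z0 = 2.0, 0.95, 10.0
dx, dy, dz = -0.1, 0.05, 0.2

# Exact change in F (Eq. 13).
exact = F(x0 + dx, y0 + dy, z0 + dz) - F(x0, y0, z0)

# First-order approximation (Eq. 4) with derivatives at the initial point.
fx, fy, fz = gradient(x0, y0, z0)
approx = fx * dx + fy * dy + fz * dz

print(round(exact, 4), round(approx, 4))
```

Rounded to four decimals, the two values agree with Eqs. 13 and 16. Note that the product xy is 1.9 both before and after the change, so the whole discrepancy arises from the neglected second-order terms.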


Exercises 

In the following exercises, independence of the observations is 
assumed, as in Eq. 8. 

Exercise 1. (a) The mean square error of the sum or difference
of two numbers having equal precisions is twice the mean square
error of either alone (assumed independent).

(b) The root mean square error (standard error) of a sum or
difference of two numbers having equal precisions is √2 times
the standard error of either alone.

(c) The root mean square error of the sum of n observations
of equal standard error, σ, is σ√n.

(d) A surveying party chains a distance of L feet. Show that
the standard error of the measurement is proportional to √L.

Exercise 2. (a) If u is a linear function of the independent
variables x, y, and z, say

$$ u = ax + by + cz $$

then the root mean square errors are related by the equation

$$ \sigma_u^2 = a^2 \sigma_x^2 + b^2 \sigma_y^2 + c^2 \sigma_z^2 $$

A special case is contained in Exercise 1a.

(b) If F is a linear function of the n independent variates
x₁, x₂, …, xₙ of the form

$$ F = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n $$

and if the variates x₁, x₂, …, xₙ are distributed with variances
σ₁², σ₂², …, σₙ², then F is distributed with variance

$$ \sigma_F^2 = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2 + a_3^2 \sigma_3^2 + \cdots + a_n^2 \sigma_n^2 $$

Exercise 3. (a) If u = axyz, then

$$ \left( \frac{\sigma_u}{u} \right)^2 = \left( \frac{\sigma_x}{x} \right)^2 + \left( \frac{\sigma_y}{y} \right)^2 + \left( \frac{\sigma_z}{z} \right)^2 $$

which interpreted says that the squares of the percentage or relative
mean square errors are additive (or the squares of the coefficients
of variation are additive).

(b) The result of part (a) can be written

$$ \sigma_U^2 = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 $$

where U = log_k u, X = log_k x, etc., k being any base whatever.

Exercise 4. If u = axy/vz, then

$$ \left( \frac{\sigma_u}{u} \right)^2 = \left( \frac{\sigma_x}{x} \right)^2 + \left( \frac{\sigma_y}{y} \right)^2 + \left( \frac{\sigma_v}{v} \right)^2 + \left( \frac{\sigma_z}{z} \right)^2 $$

The squares of the percentage errors are again additive.

Exercise 5. If u = a x^α y^β z^γ, then

$$ \left( \frac{\sigma_u}{u} \right)^2 = \left( \alpha \frac{\sigma_x}{x} \right)^2 + \left( \beta \frac{\sigma_y}{y} \right)^2 + \left( \gamma \frac{\sigma_z}{z} \right)^2 $$

Here the percentage errors are increased by the factors α, β, γ.
(Exercises 3 and 4 are special cases of this.)

Exercise 6. For the conditions of Exercises 4 and 5 the relations
between the weights are respectively

$$ \frac{1}{u^2 w_u} = \frac{1}{x^2 w_x} + \frac{1}{y^2 w_y} + \frac{1}{v^2 w_v} + \frac{1}{z^2 w_z} $$

and

$$ \frac{1}{u^2 w_u} = \frac{\alpha^2}{x^2 w_x} + \frac{\beta^2}{y^2 w_y} + \frac{\gamma^2}{z^2 w_z} $$



Exercise 7. (a) If A = πr² (A the area and r the radius, d the
diameter, of a circle), then if an error Δr be committed in measuring
r, or Δd in d, the corresponding error in the area is closely

$$ \Delta A = 2\pi r\, \Delta r = \frac{2A}{r}\, \Delta r $$

whence

$$ \frac{\Delta A}{A} = 2\, \frac{\Delta r}{r} = 2\, \frac{\Delta d}{d} $$

An error of 1 percent in either the radius or the diameter thus
means about 2 percent error in the area. Also

$$ \frac{\sigma_A}{A} = 2\, \frac{\sigma_r}{r} \quad \text{and} \quad A^2 w_A = \tfrac{1}{4} r^2 w_r = \tfrac{1}{4} d^2 w_d $$

(This is a special case of Exercise 5.)

(b) The measurements on the sides of a rectangle, a and b, are
subject to the errors Δa and Δb. Show that these errors are related
to the area A by the equation

$$ \frac{\Delta A}{A} = \frac{\Delta a}{a} + \frac{\Delta b}{b} \quad \text{(The percentage errors are thus additive.)} $$

(c) The same equation is satisfied by the area of an ellipse, a
and b being the axes or semi-axes.

Exercise 8. (a) If y is in error by the amount δy, then ln y is in
error by approximately δy/y, if δy is not too big.²

(b) If logarithms to the base 10 are used, the error in log y is
approximately 0.434 δy/y.

(c) In particular, let y change from 15 to 16, as in Fig. 9. Then
calculate the increment in log y by the approximate formula just
derived, and compare it with the exact value of the increment. In
other words, carry out the calculations mentioned in the legend of
Fig. 9.

2 The abbreviation ln is used here for “logarithme naturel,” as is common
in Europe, and among chemists everywhere. The abbreviation log will be
used for a logarithm to base 10.



(d) Let Y = ln y; then

$$ \sigma_Y = \frac{\sigma_y}{y} $$

and

$$ w_Y = y^2 w_y $$

(This result is important; see Exercise 18 in Ch. X, p. 201.)

[Figure 9: log y plotted against y. At y = 15.0, log y = 1.17609; at y = 16.0, log y = 1.20412; δy = 1.0, δ log y = 0.02803.]

Fig. 9. Illustrating the relation between an error in y and an error in
log y. Here y changes by unity, and log y changes by 0.02803. The
approximate relation δ log y = 0.434 δy/y gives 0.02895, which is to be compared
with the exact value 0.02803. Smaller changes (smaller values of δy) show
better agreement, but even for this rather large value of δy the approximate
relation would be adequate for many purposes. (See Exercise 8c, p. 44.)

(e) If Y = log y, then

$$ \sigma_Y = 0.434\, \frac{\sigma_y}{y} $$

and

$$ \frac{1}{w_Y} = 0.434^2\, \frac{1}{y^2 w_y} $$

Exercise 9. Let u = a e^{bx}, then

$$ \sigma_u^2 = b^2 u^2 \sigma_x^2 \quad \text{or} \quad \sigma_U^2 = b^2 \sigma_x^2 $$

where

U = ln u



Exercise 10. The period of a simple pendulum is T = 2π√(L/g).
Show that if the length L is too long by one-tenth of a percent, the
clock will lose about 43 seconds per day.

Exercise 11. (a) Prove that if F is a function of x, and x a function
of t, then

$$ \frac{F_x F_x}{w_x} = \frac{F_t F_t}{w_t} $$

where F_x denotes dF/dx, and F_t denotes dF/dt.

(b) Prove from Eq. 9 that

$$ w_x = \frac{1}{(dx/dt)^2}\, w_t $$

$$ \mathrm{Var}\, x = \left( \frac{dx}{dt} \right)^2 \mathrm{Var}\, t \quad \text{(Var denotes variance.)} $$

$$ \sigma_x = \left| \frac{dx}{dt} \right| \sigma_t \quad (\sigma_x \text{ denotes the standard error of } x;\ \sigma_t \text{ the standard error of } t.) $$

Exercise 12. Prove that when the line

$$ y = b'x $$

is fitted to points for which x is subject to error and y free of error
(as in Sec. 17), it turns out that the weight of b′ is

$$ w_{b'} = \frac{1}{b'^3} \sum w_x x y = \frac{1}{b'^4} \sum w_x y^2 $$

Hint 1: From Section 15a, wherein y was subject to error, and
x free of error, we had

$$ b = \frac{\sum w_y x y}{\sum w_y x^2}, \qquad w_b = \sum w_y x^2 \quad \text{(Eq. 38, p. 33)} $$

Now look at Fig. 7 on page 31. Viewed from the back it will
appear like Fig. 10, and the equation of the line will be

$$ x = \frac{1}{b'}\, y \quad (y \text{ now free of error}) $$

From the equations just written for b and its weight, we may
now interchange x and y in the formula for b and write

$$ \frac{1}{b'} = \frac{\sum w_x x y}{\sum w_x y^2} \quad \text{(Cf. Eq. 46, p. 35.)}, \qquad w_{1/b'} = \sum w_x y^2 $$

whence by the result of the preceding exercise, part (b),

$$ w_{b'} = \frac{1}{b'^4} \sum w_x y^2 = \frac{1}{b'^3} \sum w_x x y \qquad \text{Q.E.D.} $$

Fig. 10. Fig. 7 when viewed from the back.

Hint 2: (Due to my colleague Morris H. Hansen.) Write

$$ b' = \frac{\sum w_x y^2}{w_{x1} x_1 y_1 + w_{x2} x_2 y_2 + \cdots + w_{xn} x_n y_n} \quad \text{(Cf. Eq. 46, p. 35.)} $$

$$ \frac{\partial b'}{\partial x_i} = -b'\, \frac{w_{xi}\, y_i}{\sum w_x x y} $$

Now make use of Eq. 8 on page 40 and get

$$ \frac{1}{w_{b'}} = \sum \left( \frac{\partial b'}{\partial x_i} \right)^2 \frac{1}{w_{xi}} = \frac{b'^3}{\sum w_x x y} \qquad \text{Q.E.D.} $$





Exercise 13. (a) Recompute the approximation to ΔF in
Section 22 by using the derivatives F_x, F_y, F_z evaluated at x +
Δx = 1.9, y + Δy = 1.0, z + Δz = 10.2, instead of at the initial
values of x, y, z. (Answer: ΔF = −1.3184. Note that the
result is very close to that shown by Eq. 16.)

(b) Show that when the derivative in Eq. 1 is evaluated at
x + Δx, instead of at x, the result for Δf differs only in squares
and higher powers of Δx from the value obtained for Δf by
evaluating the derivative at x.

(c) Prove a similar statement for Eq. 4 when the derivatives are
evaluated at x + Δx, y + Δy, z + Δz. (In practice we are
obliged to evaluate the derivatives at the observed points, not the
adjusted points. Fortunately the distinction in the results is
usually negligible.)

Exercise 14. If Vᵢ is the ith residual from the line in Sec. 15,
page 30, that is, if

$$ V_i = y_i - b x_i $$

then, since

$$ b = r \sigma_y / \sigma_x $$

it follows that

$$ V_i = y_i - r (\sigma_y / \sigma_x)\, x_i $$

Show that the variance of V is

$$ \mathrm{Var}\, V = \sigma^2(\mathrm{ext}) = \sigma_y^2 (1 - r^2) $$

Hint: Since x and y are correlated, use Eq. 7 and find that

$$ \mathrm{Var}\, V = \sigma_y^2 + (-r \sigma_y / \sigma_x)^2 \sigma_x^2 + 2 (-r \sigma_y / \sigma_x)\, r \sigma_x \sigma_y = \sigma_y^2 (1 - r^2) $$

See Exercise 36 on page 177, where Σ Vᵢ² is given explicitly.



CHAPTER IV 


THE GENERAL PROBLEM IN LEAST SQUARES 

23. Outline of the problem. 1 As a result of any experiment or 
sample survey there will be observations, and when the adjustment 
is completed, to each observed value there will be a corresponding 
adjusted value. It is useful to introduce the concept of a true 
value, which is merely the average value that would result from 
repeating the experiment a large number of times in a state of 
randomness. In curve fitting, one can visualize the relation be¬ 
tween the observed, adjusted, and true coordinates, and it may be 
helpful to the reader to turn forward at this time to Figs. 16 and 17 
on pages 132 and 133. 

In formulating the general problem we shall deal with the quan¬ 
tities listed in the table below — 


Observed quantities:                      X₁, X₂, …, Xₙ;          Y₁, Y₂, …, Yₙ
Their adjusted (or calculated) values:    x₁, x₂, …, xₙ;          y₁, y₂, …, yₙ
Their weights:                            w_x1, w_x2, …, w_xn;    w_y1, w_y2, …, w_yn
Their true values:                        ξ₁, ξ₂, …, ξₙ;          η₁, η₂, …, ηₙ
The residuals (obs’d − calc’d):           V_x1, V_x2, …, V_xn;    V_y1, V_y2, …, V_yn


In geometrical problems, and other problems not involving 
parameters, the observations need not be considered as coordi¬ 
nates of observed points. 

The assumption will be made here that there is no correlation 
between the errors in the observations. This assumption covers 
a wide class of problems, but does fail to cover some. 

1 The development from here on is an amplification of three papers that 
appeared in the Phil. Mag. The references are vol. 11, 1931: pp. 146-158; 
vol. 17, 1934: pp. 804-829; vol. 19, 1935: pp. 389-402. 



The residuals (V) are defined by equations typified by

$$ V_{xi} = X_i - x_i, \qquad V_{yi} = Y_i - y_i \qquad (\text{Res} = \text{obs'd} - \text{calc'd}) \tag{1} $$

It is the residuals that are actually calculated first, and in actual 
use, these equations are therefore reversed. Once the residuals 
are found, the adjusted quantities are calculated by subtracting 
each residual in turn from the corresponding observation, ac¬ 
cording to Eqs. 6 ahead. 

24. The conditions. The principle of least squares requires
that the sum of the weighted squares of the residuals,²

\[ S = \sum w \cdot \mathrm{res}^2 \tag{2} \]

shall be made a minimum with respect to the adjusted values
$x_1, x_2, \ldots, x_n$, $y_1, y_2, \ldots, y_n$. But this is not a simple problem in
the maximum and minimum of functions, for here the adjusted
values are related to one another. For example, in the case of
measurements on the three angles of a plane triangle, we required
that $x_1 + x_2 + x_3 = 180^\circ$ (see p. 7). In curve fitting, the problem
is further complicated by the fact that the conditions on the
adjusted values ($x_i$) involve the estimates a, b, c of the unknown
parameters $\alpha$, $\beta$, $\gamma$. In the problem of Section 10, for instance,
the adjusted values of the x coordinates of the n points were all
required to be equal to a, which was then evaluated as $\bar{x}$ (Fig. 5)
to make the sum of the squares of the residuals a minimum.

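So to take care of the general case we shall suppose that the
adjusted values $x_i$ and $y_i$ are subject to $\nu$ conditions, to be
symbolized as

\[
\left.
\begin{aligned}
F^1(x_1, x_2, \ldots, y_n;\ a, b, c) &= 0 \\
F^2(\ \text{''}\ ) &= 0 \\
\vdots \qquad & \\
F^\nu(\ \text{''}\ ) &= 0
\end{aligned}
\right\} \quad \text{$\nu$ equations for $\nu$ conditions} \tag{3}
\]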

2 The sign $\sum$ will denote summation over all observations, x and y both, if
both are observed.



[Ch. IV] THE GENERAL PROBLEM IN LEAST SQUARES 51 


The superscript on each F distinguishes that condition from another.
Different sorts of problems are characterized by the different
kinds of conditions that the adjusted quantities $x_i$, $y_i$, and
a, b, c are subjected to. From the theoretical standpoint, the
different problems are all conveniently handled alike. This is
possible because there is only one principle of least squares, namely,
the minimizing of $\chi^2$.

Eqs. 3 will be referred to as the conditions , or the condition 
equations. The functions F 1 , F 2 , etc., on the left, are the condi¬ 
tion functions. They must be so chosen that when equated to zero 
they force the conditions that are to be imposed on the adjusted 
coordinates, angles, lengths, etc. 

The assumption behind this development is that the conditions
would all be satisfied exactly by the true (unknown) quantities
being measured, and the true parameters $\alpha$, $\beta$, $\gamma$, all of which,
theoretically at least, could be had closely enough by increasing
the number of experiments.

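25. Notation for the derivatives. The derivatives of the condition
functions will be denoted by subscripts, as in Eq. 5 of
Chapter III (p. 39). Specifically, the notation will be as follows:

\[
\begin{aligned}
F_{xi}^h &= \frac{\partial F^h}{\partial x_i}, &\qquad F_{yi}^h &= \frac{\partial F^h}{\partial y_i}, &\qquad &\text{etc.} \\
F_a^h &= \frac{\partial F^h}{\partial a}, &\qquad F_b^h &= \frac{\partial F^h}{\partial b}, &\qquad &\text{etc.}
\end{aligned}
\tag{4}
\]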


Denoting differentiations by subscripts is very convenient in some 
work, as it is here. It is a common practice among mathematicians. 

The subscript 0 in Eq. 5 below does not denote differentiation, but 
an approximation to the condition function F h . 


These derivatives, like the condition functions themselves, are
functions of $x_1, x_2, \ldots, y_n$, a, b, c. In what follows, we shall need
numerical values of these derivatives, and fortunately, for most
purposes, it will suffice to evaluate them with the observed
quantities $X_1, X_2, \ldots, Y_n$, and with the best available approximations
$a_0, b_0, c_0$ obtainable for the parameters (cf. Ch. III; in particular,
Exercise 13). In other words, $F_{x1}^h$ is to be a number representing
our best guess³ at the numerical value of this derivative, and
similarly for the other derivatives.

We then write

\[ F_0^h = F^h(X_1, X_2, \ldots, Y_n;\ a_0, b_0, c_0), \qquad h = 1, 2, \ldots, \nu \qquad \text{$\nu$ equations} \tag{5} \]

$F_0^1$ is a small number, being just the amount by which the
condition $F^1 = 0$ fails to be satisfied by the observed values $X_1,
X_2, \ldots$, and the approximations $a_0, b_0, c_0$. Similar statements
hold for $F_0^2, F_0^3, \ldots, F_0^\nu$.

As stated earlier, $a_0, b_0, c_0$ are approximate values of a, b, c.

They can usually be arrived at somehow, as by forcing three of
the conditions, i.e., solving for the values of a, b, and c that make
three of the condition functions vanish. This is the so-called
method of selected points, concerning which more is said in the
reduced type at the end of Section 55 (p. 138). Each $F_0$ would
be exactly zero except for errors of observation and the consequent
impossibility of choosing $a_0, b_0, c_0$ to satisfy all the
conditions simultaneously.
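
As a small illustration of the method of selected points, the sketch below takes a two-parameter case rather than the three parameters of the text: a straight line y = a + bx, with the condition function written as F = y - a - bx. The data, the function name, and the choice of the first and last points are all assumptions made for the example, not anything prescribed by the book.

```python
def selected_points_line(p, q):
    """Force the line y = a + b*x through two chosen observed points.

    Hypothetical helper: it makes two of the condition functions
    F = y - a - b*x vanish exactly and returns the resulting a0, b0.
    """
    (x1, y1), (x2, y2) = p, q
    b0 = (y2 - y1) / (x2 - x1)
    a0 = y1 - b0 * x1
    return a0, b0


# Illustrative observations (invented for this sketch).
points = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
a0, b0 = selected_points_line(points[0], points[-1])

# F0 for every point: zero at the two selected points, small elsewhere.
F0 = [y - a0 - b0 * x for x, y in points]
```

The two selected points give F0 = 0 by construction; the remaining values of F0 are the small closing errors that the adjustment has to distribute.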

26. The reduced conditions. Now let the conditions be made
linear in the residuals $V_{x1}, V_{x2}, \ldots, V_{yn}$, A, B, C, by expanding
Eqs. 3 by Taylor's series, retaining only the first powers of the
residuals,⁴ and remembering that

\[
\left.
\begin{aligned}
x_i &= X_i - V_{xi} \\
y_i &= Y_i - V_{yi} \\
a &= a_0 - A \\
b &= b_0 - B \\
c &= c_0 - C
\end{aligned}
\right\} \quad \text{(Calc'd = obs'd $-$ residual)} \tag{6}
\]

3 If our best guess is too far wrong, a second adjustment will be required, 
but this rarely happens in practice. See the quotation from Gauss on page 180. 

4 The problem of a straight line with no error at all in one of the coordinates 
(Exercises 1 and 7 in Sec. 65) is one in which there are no squares and higher 
powers of the residuals to neglect, hence no discrepancies of the kind men¬ 
tioned (cf. Eq. 3 of Ch. III). The simple example of the triangle in Chap¬ 
ters I and V is another. On rare occasions the residuals may be so large that 
the neglected terms invalidate the reduced conditions (Eqs. 7), in which event, 
in general, no systematic solution is available. An exception is the straight 
line under certain circumstances of weighting; see Exercise 6 of Section 65. 





When this is done, the conditions originally expressed by Eqs. 3
take the form

\[
\sum_x F_{xi}^h V_{xi} + \sum_y F_{yi}^h V_{yi} + F_a^h A + F_b^h B + F_c^h C = F_0^h,
\qquad h = 1, 2, \ldots, \nu \qquad \text{$\nu$ equations} \tag{7}
\]

These are called the reduced conditions. They are equivalent to 
Eqs. 3, except for small discrepancies arising from the neglect of 
higher powers of the residuals in the expansion. 
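
The size of these discrepancies can be checked numerically. The sketch below is only an illustration under assumed inputs: the condition function y = a·x^b, the observed values, and the trial residuals are all invented, and the derivatives are estimated by central differences instead of being written out by hand.

```python
def partial(f, args, j, eps=1e-6):
    """Central-difference estimate of the derivative of f w.r.t. args[j]."""
    up = list(args)
    dn = list(args)
    up[j] += eps
    dn[j] -= eps
    return (f(*up) - f(*dn)) / (2 * eps)


def F(x, y, a, b):
    """Hypothetical condition function for the curve y = a * x**b."""
    return y - a * x ** b


# Observed point and approximate parameters (illustrative numbers).
X, Y, a0, b0 = 2.0, 8.3, 2.0, 2.0
obs = (X, Y, a0, b0)

F0 = F(*obs)                                   # Eq. 5
Fx, Fy, Fa, Fb = (partial(F, obs, j) for j in range(4))

# Choose trial residuals Vx, A, B and let the reduced condition (Eq. 7)
# fix Vy:  Fx*Vx + Fy*Vy + Fa*A + Fb*B = F0.
Vx, A, B = 0.01, 0.02, 0.0
Vy = (F0 - Fx * Vx - Fa * A - Fb * B) / Fy

# Adjusted values by Eqs. 6; the original condition is then met only
# up to terms of second order in the residuals.
discrepancy = F(X - Vx, Y - Vy, a0 - A, b0 - B)
```

With these numbers F0 is 0.3, while the discrepancy left in the original condition is of the order of 0.001, which is the kind of neglected higher-power term the text refers to.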

27. The method of Lagrange multipliers.⁵ Now if S is at its
minimum value, and if any or all of the residuals then undergo
small variations (expressed by $\delta$), the variation in S will be zero
to within higher powers of the variations in the residuals; in other
words

\[ \tfrac{1}{2}\,\delta S = \sum w V\,\delta V = 0, \qquad \text{one equation} \tag{8} \]

The variations typified by $\delta V$ are not arbitrary, but must always
permit the residuals to satisfy the condition Eqs. 3, or their
equivalent, Eqs. 7. So by differentiating Eqs. 7 we find that

\[
\sum_x F_{xi}^h\,\delta V_{xi} + \sum_y F_{yi}^h\,\delta V_{yi} + F_a^h\,\delta A + F_b^h\,\delta B + F_c^h\,\delta C = 0,
\qquad h = 1, 2, \ldots, \nu \qquad \text{$\nu$ equations} \tag{9}
\]

Now multiply Eq. 9 through by $-\lambda_h$, an arbitrary multiplier, to
get

\[
-\lambda_h \Big( \sum_x F_{xi}^h\,\delta V_{xi} + \sum_y F_{yi}^h\,\delta V_{yi} \Big) - \lambda_h F_a^h\,\delta A - \lambda_h F_b^h\,\delta B - \lambda_h F_c^h\,\delta C = 0,
\qquad h = 1, 2, \ldots, \nu \qquad \text{$\nu$ equations} \tag{10}
\]

5 This is the method of Lagrange multipliers; see his Mécanique analytique
(1811), tome 1, p. 74; or Benjamin Williamson, Differential Calculus
(Longmans, 1893), Chapter 11. The least squares problem without parameters was
worked out by Gauss. He called his multipliers correlata, not mentioning
Lagrange. Many texts in least squares use the term "correlates" or
"correlatives" in this connexion, but apparently none makes any mention of
Lagrange. The reference to Gauss is his Supplementum Theoriae Combinationis
Observationum Erroribus Minimis Obnoxiae (Göttingen, 1826; Werke,
vol. 4), art. 11.





Add Eqs. 8 and 10 and collect coefficients of the variations $\delta$:

\[
\sum_x \big( w_{xi} V_{xi} - [\lambda_h F_{xi}^h] \big)\,\delta V_{xi}
+ \sum_y \big( w_{yi} V_{yi} - [\lambda_h F_{yi}^h] \big)\,\delta V_{yi}
- [\lambda_h F_a^h]\,\delta A - [\lambda_h F_b^h]\,\delta B - [\lambda_h F_c^h]\,\delta C = 0,
\qquad \text{one equation} \tag{11}
\]

In Eq. 11 there are two kinds of summations: there is the summation
$\sum$ running over all observations, and there is also the summation
over h, in which h runs from 1 to $\nu$, i.e., over all conditions.
The latter summation will be denoted by the Gauss brackets [ ].

Here the number of parameters is taken as 3. If there were p
parameters, there would be 2n + p variations. For practice,
the student should write out Eqs. 8-15 with (e.g.) n = 3 and
$\nu$ = 2 with two parameters. There is no other way to gain
familiarity with the development.

Eq. 11 contains 2n + 3 variations, $\delta V_{x1}, \delta V_{x2}, \delta V_{x3}, \ldots, \delta V_{yn}$,
$\delta A, \delta B, \delta C$. But on account of Eqs. 9, only $2n + 3 - \nu$ of these
variations are arbitrary. Let $\lambda_1, \lambda_2, \ldots, \lambda_\nu$ be so chosen that $\nu$
of the coefficients in Eq. 11 vanish; then the coefficients of the
variations in the remaining $2n + 3 - \nu$ terms must also vanish,
because they are used with an equal number of variations, each of
which is arbitrary. Then all the coefficients in Eq. 11 vanish,
which means that


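\[ V_{xi} = \frac{1}{w_{xi}}\,[\lambda_h F_{xi}^h] \qquad \text{$n$ equations; } i = 1, 2, \ldots, n \tag{12x} \]

\[ V_{yi} = \frac{1}{w_{yi}}\,[\lambda_h F_{yi}^h] \qquad \text{$n$ equations; } i = 1, 2, \ldots, n \tag{12y} \]

\[ [\lambda_h F_a^h] = 0 \qquad \text{one equation} \tag{13a} \]

\[ [\lambda_h F_b^h] = 0 \qquad \text{one equation} \tag{13b} \]

\[ [\lambda_h F_c^h] = 0 \qquad \text{one equation} \tag{13c} \]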


Each residual ($V_{xi}$ or $V_{yi}$) in Eqs. 12 is inversely proportional
to the weight $w_{xi}$ or $w_{yi}$ of the corresponding observation. Does
this seem reasonable? If any observation is relatively infallible,
having $w = \infty$, then its residual is zero; i.e., there is no correction.
In curve fitting, for example, it sometimes happens that
all the x coordinates are free of error; the corresponding residuals
are then 0, and the calculated values of x are the same as the
observed.





The $\nu$ Lagrange multipliers ($\lambda$) are no longer arbitrary; they
now have particular values; they have been chosen so as to cause
$\nu$ of the coefficients in Eq. 11 to vanish (vide supra). Their values
can be found from Eqs. 13 and 15. We shall now derive Eqs. 15.

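28. The general normal equations. Now substitute
$(1/w_{xi}) \times [\lambda_h F_{xi}^h]$ for $V_{xi}$, and likewise for $V_{yi}$, in the reduced conditions
(Eqs. 7). Collect the coefficients of $\lambda_1, \lambda_2, \ldots, \lambda_\nu$, A, B, C, and
in so doing set

\[
\begin{aligned}
L_{hk} &= \frac{F_{x1}^h F_{x1}^k}{w_{x1}} + \frac{F_{x2}^h F_{x2}^k}{w_{x2}} + \cdots + \frac{F_{xn}^h F_{xn}^k}{w_{xn}} \\
&\quad + \frac{F_{y1}^h F_{y1}^k}{w_{y1}} + \frac{F_{y2}^h F_{y2}^k}{w_{y2}} + \cdots + \frac{F_{yn}^h F_{yn}^k}{w_{yn}} \\
&= L_{kh}
\end{aligned}
\tag{14}
\]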

The following system of equations results. They may be called 
the “ general normal equations.” For convenience, only the 
coefficients are tabled, the unknowns being written across the top. 
On the left of the equality sign, each coefficient is to be multiplied 
by the unknown appearing above it, the plus sign between terms 
being understood. On the right, each F 0 is multiplied by unity, 
hence the heading “ 1 ” for that column. 


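The general normal equations

\[
\begin{array}{ccccc|ccc|c|l}
\lambda_1 & \lambda_2 & \lambda_3 & \cdots & \lambda_\nu & A & B & C & =\,1 & \\
\hline
L_{11} & L_{21} & L_{31} & \cdots & L_{\nu 1} & F_a^1 & F_b^1 & F_c^1 & F_0^1 & (15) \\
L_{12} & L_{22} & L_{32} & \cdots & L_{\nu 2} & F_a^2 & F_b^2 & F_c^2 & F_0^2 & (15) \\
L_{13} & L_{23} & L_{33} & \cdots & L_{\nu 3} & F_a^3 & F_b^3 & F_c^3 & F_0^3 & (15) \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \vdots & \\
L_{1\nu} & L_{2\nu} & L_{3\nu} & \cdots & L_{\nu\nu} & F_a^\nu & F_b^\nu & F_c^\nu & F_0^\nu & (15) \\
\hline
F_a^1 & F_a^2 & F_a^3 & \cdots & F_a^\nu & 0 & 0 & 0 & 0 & (13a) \\
F_b^1 & F_b^2 & F_b^3 & \cdots & F_b^\nu & 0 & 0 & 0 & 0 & (13b) \\
F_c^1 & F_c^2 & F_c^3 & \cdots & F_c^\nu & 0 & 0 & 0 & 0 & (13c)
\end{array}
\]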





Remark 1. Along the diagonal, h = k, and off the diagonal,
h ≠ k. By comparing Eq. 14 with Eqs. 8 and 9 of Chapter III,
it can be seen that $1/L_{hh}$ is the weight of the condition function $F^h$.

In curve fitting it is in fact sometimes useful to write 1/W in
place of L, as will frequently be done later. (Cf. also Remark 3
on p. 135.) The term $L_{hk}$ off the diagonal is the reciprocal of the
product variance of the two condition functions $F^h$ and $F^k$. It
will thus be observed that the diagonal in the general normal
equations is made up of the variances of the condition functions,
every term of which is positive, and that the terms off the
diagonal are product variances, which can sometimes be
negative.

Remark 2. Since $L_{hk} = L_{kh}$, as is indicated by Eq. 14, the
coefficients of the unknowns in Eqs. 15 are symmetrical about
the diagonal. Because of this symmetry, it will be possible, following
Gauss, Doolittle, and others, to shorten the numerical
computation for finding the unknowns (Secs. 34 and 61). In
the abbreviated solution, it is not necessary to enter the coefficients
below the diagonal (see Sec. 30).

The general normal equations are v + 3 in number, and can be 
solved for the v + 3 unknowns written across the top. Special 
methods of solution will be taken up in Sections 34 and 61, but for 
the present we shall only note that once the residuals A, B , and C 
are found, the final (adjusted) values of the parameters are ob¬ 
tained by subtracting the residuals from the approximate values, 
as shown in Eqs. 6 . 

The solution of the general normal equations yields also numerical
values for the Lagrange multipliers $\lambda_1, \lambda_2, \ldots, \lambda_\nu$, which through
Eqs. 12 enable the residuals (V) to be calculated. The observations
$X_i$ and $Y_i$ are then adjusted by subtracting the residuals,
again according to Eqs. 6. The adjusted quantities $x_i$ and $y_i$, along with
the adjusted parameters found by Eqs. 6, will satisfy the $\nu$ conditions
expressed by Eqs. 3 (p. 50), or their equivalent, the reduced
conditions, Eqs. 7 (p. 53).
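
To make the procedure concrete, here is a minimal sketch that builds and solves the general normal equations for one assumed special case: a straight line y = a + bx with the x coordinates free of error, so that 1/w_x = 0 and each condition F^h = y_h - a - b·x_h involves a single observation. The data, the function names, and the small Gaussian-elimination helper are assumptions of the example; the systematic solutions of Sections 34 and 61 are not reproduced here.

```python
def solve(M, rhs):
    """Solve M u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][k] * u[k] for k in range(r + 1, n))
        u[r] = (A[r][n] - s) / A[r][r]
    return u


def fit_line(xs, ys, ws, a0, b0):
    """Adjust a line y = a + b*x via the general normal equations.

    Assumes errors in y only, with weights ws. Here F_y = 1,
    F_a = -1, F_b = -x_h, so L_hh = 1/w_h and L_hk = 0 for h != k.
    """
    nu = len(xs)
    F0 = [y - a0 - b0 * x for x, y in zip(xs, ys)]      # Eq. 5
    size = nu + 2
    M = [[0.0] * size for _ in range(size)]
    for h in range(nu):
        M[h][h] = 1.0 / ws[h]                           # Eq. 14
        M[h][nu] = M[nu][h] = -1.0                      # F_a column and row
        M[h][nu + 1] = M[nu + 1][h] = -xs[h]            # F_b column and row
    sol = solve(M, F0 + [0.0, 0.0])                     # Eqs. 15 and 13
    lam, A, B = sol[:nu], sol[nu], sol[nu + 1]
    V = [l / w for l, w in zip(lam, ws)]                # Eq. 12y
    S = sum(l * f for l, f in zip(lam, F0))             # Eq. 17
    return a0 - A, b0 - B, V, S                         # Eqs. 6


a, b, V, S = fit_line([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8],
                      [1.0, 1.0, 1.0, 1.0], a0=1.0, b0=2.0)
```

In this linear case the result agrees with the ordinary least-squares line (a = 1.09, b = 1.94 for the data shown), whatever starting values a0 and b0 are used, since no higher powers of the residuals are neglected.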

Exercise . Apply Taylor’s series to any one of Eqs. 3 to derive 
the corresponding reduced condition shown as Eq. 7. 

29. Short expression for S. The normal equations are really
normal. The matrix of the coefficients is positive definite. By
definition, $S = \sum w \cdot \mathrm{res}^2$. Now by substituting for the residuals
in terms of Eqs. 12x and 12y, we find that

\[
\begin{aligned}
S &= \sum_x \frac{1}{w_{xi}}\,[\lambda_h F_{xi}^h]^2 + \sum_y \frac{1}{w_{yi}}\,[\lambda_h F_{yi}^h]^2 \\
&= \frac{1}{w_{x1}} \big( \lambda_1 F_{x1}^1 + \lambda_2 F_{x1}^2 + \cdots + \lambda_\nu F_{x1}^\nu \big)^2 \\
&\quad + \frac{1}{w_{x2}} \big( \lambda_1 F_{x2}^1 + \lambda_2 F_{x2}^2 + \cdots + \lambda_\nu F_{x2}^\nu \big)^2 \\
&\quad + \cdots + \frac{1}{w_{yn}} \big( \lambda_1 F_{yn}^1 + \lambda_2 F_{yn}^2 + \cdots + \lambda_\nu F_{yn}^\nu \big)^2 \\
&= L_{11}\lambda_1^2 + L_{22}\lambda_2^2 + \cdots + L_{\nu\nu}\lambda_\nu^2 \\
&\quad + 2\,(L_{12}\lambda_1\lambda_2 + \text{the other cross-product terms})
\end{aligned}
\]

Another way of writing this is to border the following symmetrical
array.



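\[
\begin{array}{c|ccccc|ccc}
 & \lambda_1 & \lambda_2 & \lambda_3 & \cdots & \lambda_\nu & A & B & C \\
\hline
\lambda_1 & L_{11} & L_{21} & L_{31} & \cdots & L_{\nu 1} & F_a^1 & F_b^1 & F_c^1 \\
\lambda_2 & L_{12} & L_{22} & L_{32} & \cdots & L_{\nu 2} & F_a^2 & F_b^2 & F_c^2 \\
\lambda_3 & L_{13} & L_{23} & L_{33} & \cdots & L_{\nu 3} & F_a^3 & F_b^3 & F_c^3 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
\lambda_\nu & L_{1\nu} & L_{2\nu} & L_{3\nu} & \cdots & L_{\nu\nu} & F_a^\nu & F_b^\nu & F_c^\nu \\
\hline
A & F_a^1 & F_a^2 & F_a^3 & \cdots & F_a^\nu & 0 & 0 & 0 \\
B & F_b^1 & F_b^2 & F_b^3 & \cdots & F_b^\nu & 0 & 0 & 0 \\
C & F_c^1 & F_c^2 & F_c^3 & \cdots & F_c^\nu & 0 & 0 & 0
\end{array}
\tag{16}
\]

\[
= \lambda_1 F_0^1 + \lambda_2 F_0^2 + \lambda_3 F_0^3 + \cdots + \lambda_\nu F_0^\nu = [\lambda_h F_0^h]
\]

by Eqs. 15, page 55. We have thus discovered that

\[ S = [\lambda_h F_0^h] \tag{17} \]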

In this way, S, the minimized sum of the weighted squares 
of the residuals, is expressible in terms of the Lagrange multipliers; 
wherefore, so far as S is concerned, it is not necessary to compute 
the residuals and square them. Later, we shall see that S can be 
computed by a systematic procedure without even finding the 
Lagrange multipliers (Secs. 34 and 61). It is a fact that in some
problems it is nevertheless advisable to compute the residuals, so
that they can be examined individually (cf. Sec. 78).

Gauss derived Eq. 17 for the case of geometric conditions, for
which parameters are absent. For other special expressions of S,
useful in curve fitting, when parameters are present, see the exercise
following this section; also Exercise 3, page 163.
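
The step from the quadratic form to Eq. 17 can be spelled out as follows. This is a sketch of the reasoning in the notation of Eqs. 13 to 15, not a quotation from any of the works cited. Multiply the h-th of Eqs. 15 by $\lambda_h$ and sum over all conditions:

\[
\sum_{h}\sum_{k} L_{hk}\,\lambda_h \lambda_k
+ A\,[\lambda_h F_a^h] + B\,[\lambda_h F_b^h] + C\,[\lambda_h F_c^h]
= [\lambda_h F_0^h]
\]

By Eqs. 13a, 13b, and 13c the three bracketed sums on the left vanish, and the double sum is the expression for S found at the beginning of this section, so that

\[
S = \sum_{h}\sum_{k} L_{hk}\,\lambda_h \lambda_k = [\lambda_h F_0^h]
\]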

Since $S \ge 0$, the quadratic form (16) is positive definite⁶; that
is, no matter what values be given to $\lambda_1, \lambda_2, \ldots, \lambda_\nu$, A, B, C, the
quadratic form (16) can not be negative. The symmetry of the
general normal equations (Sec. 28) has already been noted; hence
these equations are really normal, i.e., they are not only symmetric,
but the quadratic form of the coefficients is positive
definite.

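Exercise. Show that

\[
S = -\,
\frac{
\begin{vmatrix}
L_{11} & L_{21} & L_{31} & \cdots & L_{\nu 1} & F_a^1 & F_b^1 & F_c^1 & F_0^1 \\
L_{12} & L_{22} & L_{32} & \cdots & L_{\nu 2} & F_a^2 & F_b^2 & F_c^2 & F_0^2 \\
L_{13} & L_{23} & L_{33} & \cdots & L_{\nu 3} & F_a^3 & F_b^3 & F_c^3 & F_0^3 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \vdots \\
L_{1\nu} & L_{2\nu} & L_{3\nu} & \cdots & L_{\nu\nu} & F_a^\nu & F_b^\nu & F_c^\nu & F_0^\nu \\
F_a^1 & F_a^2 & F_a^3 & \cdots & F_a^\nu & 0 & 0 & 0 & 0 \\
F_b^1 & F_b^2 & F_b^3 & \cdots & F_b^\nu & 0 & 0 & 0 & 0 \\
F_c^1 & F_c^2 & F_c^3 & \cdots & F_c^\nu & 0 & 0 & 0 & 0 \\
F_0^1 & F_0^2 & F_0^3 & \cdots & F_0^\nu & 0 & 0 & 0 & 0
\end{vmatrix}
}{
\begin{vmatrix}
L_{11} & L_{21} & L_{31} & \cdots & L_{\nu 1} & F_a^1 & F_b^1 & F_c^1 \\
L_{12} & L_{22} & L_{32} & \cdots & L_{\nu 2} & F_a^2 & F_b^2 & F_c^2 \\
L_{13} & L_{23} & L_{33} & \cdots & L_{\nu 3} & F_a^3 & F_b^3 & F_c^3 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
L_{1\nu} & L_{2\nu} & L_{3\nu} & \cdots & L_{\nu\nu} & F_a^\nu & F_b^\nu & F_c^\nu \\
F_a^1 & F_a^2 & F_a^3 & \cdots & F_a^\nu & 0 & 0 & 0 \\
F_b^1 & F_b^2 & F_b^3 & \cdots & F_b^\nu & 0 & 0 & 0 \\
F_c^1 & F_c^2 & F_c^3 & \cdots & F_c^\nu & 0 & 0 & 0
\end{vmatrix}
}
\]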



6 Maxime Bôcher, Higher Algebra (Macmillan, 1907), page 150.






Part C 

CONDITIONS WITHOUT PARAMETERS 


CHAPTER V 

GEOMETRIC CONDITIONS 

30. Adaptation of the general solution to conditions without
parameters. When the conditions imposed by Eqs. 3 or 7 (pp. 50
and 53) of the last chapter are geometric, there are no parameters
or adjustable constants. The quantities a, b, and c then do not
exist, and in the general normal equations (p. 55), all the rows and
columns containing A, B, C, $F_a^h$, $F_b^h$, and $F_c^h$ are to be deleted.
The Lagrange multipliers ($\lambda$) are the only unknowns left, and only
the square array of L coefficients remains. The general solution
thus reduces to Eqs. 1 shown below.


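\[
\begin{array}{ccccc|c}
\lambda_1 & \lambda_2 & \lambda_3 & \cdots & \lambda_\nu & =\,1 \\
\hline
L_{11} & L_{12} & L_{13} & \cdots & L_{1\nu} & F_0^1 \\
 & L_{22} & L_{23} & \cdots & L_{2\nu} & F_0^2 \\
 & & L_{33} & \cdots & L_{3\nu} & F_0^3 \\
 & & & \ddots & \vdots & \vdots \\
 & & & & L_{\nu\nu} & F_0^\nu
\end{array}
\tag{1}
\]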


Here the coefficients below the diagonal have been omitted, since in
the abridged solution soon to be learned, those below the diagonal
are not used. The coefficients are to be read "down to the diagonal,
then to the right." The unknowns are the $\nu$ Lagrange multipliers.

This type of problem (no parameters) was solved by Gauss, 1 
and is treated satisfactorily in many textbooks. It arises in 

1 See the reference to Gauss in Section 27, page 53. 



geodesy, surveying, and in astronomy, and this accounts for the 
attentions of Gauss, Bessel, and Encke, who were mainly interested 
in the problems of adjustment arising in astronomy. 

31. Example: the plane triangle. We shall return now to the 
triangle problem discussed in Section 3 (see Fig. 3, p. 7). The 
angles are measured with a transit. The weights might arise from 
the number of repetitions on each angle. 

Observations: $X_1, X_2, X_3$

Weights: $w_1, w_2, w_3$

Calc'd values: $x_1, x_2, x_3$ (to be found)

Here there is only the one condition, namely,

\[ x_1 + x_2 + x_3 = 180^\circ \qquad \text{(This corresponds to Eq. 3, p. 50.)} \tag{2} \]

so we write

\[ F(x_1, x_2, x_3) = x_1 + x_2 + x_3 - 180^\circ \tag{3} \]

(There is only the one condition, so no superscript on the F is
needed.) This condition function F will be zero when we are able
to insert the adjusted values $x_1, x_2, x_3$ into it. By inserting the
observed values we calculate

\[ F_0 = X_1 + X_2 + X_3 - 180^\circ \qquad \text{(See Eq. 5, p. 52.)} \tag{4} \]

$F_0$ is not zero unless $X_1 + X_2 + X_3$ happens to be exactly 180°,
in which case no question of adjustment arises. The derivatives
of F are

\[ F_1 = F_2 = F_3 = 1 \qquad \text{(See Eq. 4, p. 51.)} \tag{5} \]

There is only one L coefficient (why?). It could be called $L_{11}$
but no subscript is needed, so we shall use simply L. It is calculated
as follows:


\[
L = \frac{F_1 F_1}{w_1} + \frac{F_2 F_2}{w_2} + \frac{F_3 F_3}{w_3}
= \frac{1}{w_1} + \frac{1}{w_2} + \frac{1}{w_3}
\qquad \text{(Eq. 14, p. 55)} \tag{6}
\]

There is but one normal equation, namely,

\[ L\lambda = F_0 \tag{7} \]



The solution is

\[ \lambda = \frac{F_0}{L} \tag{8} \]

The numerator, $F_0$, is the amount by which the observed angles
fail to close. The denominator, L, is $1/w_1 + 1/w_2 + 1/w_3$, which
happens to be equal to $1/w_F$ by Eq. 9 on page 40 (propagation of
mean square error).

After $\lambda$ is worked out numerically, we may find the three residuals
by Eq. 12, page 54:

\[
V_1 = \frac{1}{w_1}\,\lambda F_1, \qquad
V_2 = \frac{1}{w_2}\,\lambda F_2, \qquad
V_3 = \frac{1}{w_3}\,\lambda F_3
\tag{9}
\]

The adjusted angles are then

\[ x_1 = X_1 - V_1; \qquad x_2 = X_2 - V_2; \qquad x_3 = X_3 - V_3 \tag{10} \]

The sum of the adjusted angles is identically 180°, for

\[
\begin{aligned}
x_1 + x_2 + x_3 &= X_1 + X_2 + X_3 - (V_1 + V_2 + V_3) \\
&= X_1 + X_2 + X_3 - \frac{X_1 + X_2 + X_3 - 180^\circ}{\dfrac{1}{w_1} + \dfrac{1}{w_2} + \dfrac{1}{w_3}}
\left( \frac{1}{w_1} + \frac{1}{w_2} + \frac{1}{w_3} \right) \\
&= 180^\circ \ \text{exactly}
\end{aligned}
\tag{11}
\]

The equations for $V_1$, $V_2$, and $V_3$ are valid no matter how
large $F_0$ is. This is a case where there are no higher powers of
the residuals to be neglected, and is in contrast with the more
general statement in footnote 4, page 52.

Note that the residuals are inversely proportional to the
weights of the observations; that is,

\[ V_1 : V_2 : V_3 = \frac{1}{w_1} : \frac{1}{w_2} : \frac{1}{w_3} \tag{12} \]


Thus, in this problem, the adjustment by least squares simply
takes the excess or deficiency $F_0$ (which will ordinarily be a small
amount, perhaps a few minutes of arc) and distributes it among
the three angles in inverse proportion to their weights (cf. Ch. I,
p. 8). The student should reflect on this at length. If the
action of least squares seems reasonable in this simple problem,
it may be so in more complicated ones, even if we are not so
easily able to visualize its working. Even in more complicated
problems, the principle is the same (the minimizing of $\sum w \cdot \mathrm{res}^2$
or of $\chi^2$); it is only the conditions to which the adjusted values
are subject that differ from one problem to another.
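
The whole adjustment of the triangle fits in a few lines of code. The sketch below follows Eqs. 4, 6, 8, 9, and 10 of this section, with exact fractions so that the closure to 180° can be checked without rounding; the function name and the numerical angles and weights are invented for illustration.

```python
from fractions import Fraction


def adjust_triangle(X, w):
    """Adjust three observed angles (degrees) with weights w.

    Returns the adjusted angles, the residuals, and S = lambda * F0.
    """
    F0 = sum(X) - 180                           # Eq. 4: closing error
    L = sum(Fraction(1) / wi for wi in w)       # Eq. 6
    lam = F0 / L                                # Eq. 8
    V = [lam / wi for wi in w]                  # Eq. 9, with F_i = 1
    x = [Xi - Vi for Xi, Vi in zip(X, V)]       # Eq. 10
    return x, V, lam * F0


# Illustrative observations: an excess of one degree, and the first
# angle carrying half the weight of the other two.
x, V, S = adjust_triangle([Fraction(61), Fraction(60), Fraction(60)],
                          [1, 2, 2])
```

Here the first angle, having the smallest weight, absorbs half of the excess and the other two a quarter each, in agreement with Eq. 12.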

Exercise 1. Show that the condition

\[ x_1 + x_2 + x_3 = 180^\circ \tag{2} \]

determines a plane distant $180^\circ/\sqrt{3}$ from the origin, and cutting
equal intercepts from the axes. The calculated point lies on the
plane, and the observed point off it. If the weights are all equal,
the distance between the observed and calculated points is to be
minimized, in which case the line segment joining the observed and
calculated points is perpendicular to the plane $x_1 + x_2 + x_3 =
180^\circ$. See Fig. 11.

If the weights of the observed angles are unequal, the distance
between the observed and calculated points is not to be minimized,
but rather the quantity

\[ w_1 (X_1 - x_1)^2 + w_2 (X_2 - x_2)^2 + w_3 (X_3 - x_3)^2 \tag{13} \]

Exercise 2. Any possible plane triangle is represented by a
point on this plane for which $x_1$, $x_2$, and $x_3$ are positive. Any
method of adjustment would consist of picking off some point on
this plane, corresponding to a given observed point $X_1, X_2, X_3$ off
the plane.

Exercise 3. Solve the triangle problem (p. 60) without the
Lagrange multiplier.

Hint: Take

\[ S = \sum w V^2 = \sum w (X - x)^2 \tag{14} \]



Fig. 11. The three angles of a plane triangle constitute the coordinates of
a point. The calculated (or adjusted) angles add to 180°. The point
representing the calculated angles lies on a plane distant $180^\circ/\sqrt{3}$ from the origin.
The observed point lies beyond the plane if there is an observed excess beyond
180°, but lies on the under side of the plane if there is an observed deficiency.
It lies on the plane only by accident, in which case no adjustment is required.

By the one and only condition on the adjusted values, we may take

\[ x_3 = 180^\circ - x_1 - x_2 \tag{15} \]

\[ V_3 = F_0 - V_1 - V_2 \tag{16} \]

where, as before,

\[ F_0 = X_1 + X_2 + X_3 - 180^\circ \]

Then

\[ S = w_1 V_1^2 + w_2 V_2^2 + w_3 (F_0 - V_1 - V_2)^2 \tag{17} \]

$x_1$ and $x_2$ are independent; so are $V_1$ and $V_2$. Hence we may
set $\partial S/\partial V_1$ and $\partial S/\partial V_2$ both equal to zero. The result is

\[
\left.
\begin{aligned}
w_1 V_1 - w_3 (F_0 - V_1 - V_2) &= 0 \\
w_2 V_2 - w_3 (\ \text{''}\ ) &= 0
\end{aligned}
\right\}
\tag{18}
\]

It follows that

\[ w_1 V_1 = w_2 V_2, \tag{19} \]


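and that

\[
\left.
\begin{aligned}
V_1 &= \frac{\dfrac{1}{w_1}}{\dfrac{1}{w_1} + \dfrac{1}{w_2} + \dfrac{1}{w_3}}\,F_0 \\[2ex]
V_2 &= \frac{\dfrac{1}{w_2}}{\dfrac{1}{w_1} + \dfrac{1}{w_2} + \dfrac{1}{w_3}}\,F_0 \\[2ex]
V_3 &= \frac{\dfrac{1}{w_3}}{\dfrac{1}{w_1} + \dfrac{1}{w_2} + \dfrac{1}{w_3}}\,F_0
\end{aligned}
\right\}
\tag{20}
\]

which are equivalent to Eqs. 9 on page 61, obtained with the
Lagrange multiplier.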


All problems in least squares can theoretically be solved with¬ 
out the use of Lagrange multipliers. Occasionally it may even 
seem easier to dispense with them, but most problems then be¬ 
come hopelessly involved, as Kummell discovered. 2 


32. The plane triangle continued. The weights of the adjusted
angles, and any function of them. Returning to the triangle
problem of the last section, suppose we ask for

The weight of angle $x_1$
and
The weight of the sum of $x_1 + x_2 + x_3$, after adjustment

Of course we know in advance that the weight of this sum must be
infinite, since we forced it to be a definite amount, 180°; but it
will be interesting to see if this result comes by the routine about
to be described. The rules for finding the weights of functions of
the adjusted observations are illustrated in what follows, and a
more complicated example will be worked out in the next chapter.
The theoretical proofs will be found in several books on least
squares, for example, O. M. Leland's Practical Least Squares

2 Charles H. Kummell, The Analyst (Des Moines), vol. 6,1879: pp. 97-105. 



(McGraw-Hill, 1921) and T. W. Wright and J. F. Hayford's
Adjustment of Observations (Van Nostrand, 1884, 1906).


Let

\[ G^1 = x_1 \tag{21} \]

and

\[ G^2 = x_1 + x_2 + x_3 \tag{22} \]

$G^1$ and $G^2$ are then the functions whose weights are wanted. As
many more functions could be added as desired, but here we shall
be content to see just the weights of $x_1$ and of $x_1 + x_2 + x_3$ worked
out. The procedure is as follows. We need to form certain sums,
and to this end we make up the following table, numerical values
ordinarily being inserted in place of the symbols in the body of
the table. F is defined by Eq. 3 on page 60.


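\[
\begin{array}{c|ccc|ccc|c}
(1) & (2) & (3) & (4) & (5) & (6) & (7) & (8) \\
i & F_i & G_i^1 & G_i^2 & \dfrac{F_i}{\sqrt{w_i}} & \dfrac{G_i^1}{\sqrt{w_i}} & \dfrac{G_i^2}{\sqrt{w_i}} & \text{Sum} \\
\hline
1 & 1 & 1 & 1 & \dfrac{1}{\sqrt{w_1}} & \dfrac{1}{\sqrt{w_1}} & \dfrac{1}{\sqrt{w_1}} & \\
2 & 1 & 0 & 1 & \dfrac{1}{\sqrt{w_2}} & 0 & \dfrac{1}{\sqrt{w_2}} & \text{for numerical check} \\
3 & 1 & 0 & 1 & \dfrac{1}{\sqrt{w_3}} & 0 & \dfrac{1}{\sqrt{w_3}} &
\end{array}
\]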


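Next step, from columns 5, 6, and 7 the sums called for in the
normal equations can be evaluated, as shown below.

\[
\begin{aligned}
\left[ \frac{F G^1}{w} \right] &= \frac{1}{w_1}, &\qquad
\left[ \frac{F G^2}{w} \right] &= \frac{1}{w_1} + \frac{1}{w_2} + \frac{1}{w_3} = L \ \text{as defined on p. 60} \\
\left[ \frac{G^1 G^1}{w} \right] &= \frac{1}{w_1}, &\qquad
\left[ \frac{G^2 G^2}{w} \right] &= \frac{1}{w_1} + \frac{1}{w_2} + \frac{1}{w_3} = L
\end{aligned}
\]

[ ] means summation, as in Section 27 (Gauss' notation). These
sums are appended in the $C^1$ and $C^2$ columns, and the solution
proceeds.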



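\[
\begin{array}{c|c|c|c|c}
\text{Row} & \lambda & =\,1 & C^1 & C^2 \\
\hline
\text{I} & L & F_0 & \left[ \dfrac{F G^1}{w} \right] = \dfrac{1}{w_1} & \left[ \dfrac{F G^2}{w} \right] = L \\
\hline
 & \text{How obtained} & & & \\
2 & & 0 & \left[ \dfrac{G^1 G^1}{w} \right] = \dfrac{1}{w_1} & \left[ \dfrac{G^2 G^2}{w} \right] = L \\
3 & \text{Row I} \times -\dfrac{F_0}{L} & -\dfrac{F_0 F_0}{L} & \cdots & \cdots \\
\text{II} & 2 + 3 & -\dfrac{F_0 F_0}{L} & \cdots & \cdots \\
\hline
4 & \text{Row I} \times -\dfrac{1}{w_1 L} & & -\dfrac{1}{L w_1^2} & \\
\text{II}_1 & 2 + 4 & & \dfrac{1}{w_1} - \dfrac{1}{L w_1^2} & \\
\hline
5 & \text{Row I} \times -\dfrac{L}{L} & & & -L \\
\text{II}_2 & 2 + 5 & & & 0
\end{array}
\]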


In a numerical solution, a sum column would be introduced at
the right for a check, and the spaces filled in by the ellipses would
be filled in with numbers (see the note in reduced type on p. 33).
For a numerical illustration see pages 82 and 83.

Row I gives $\lambda = F_0/L$, as already found on page 61. Looking
next at the "1" column in Row II we see $-F_0 F_0/L$, which has the
value $-\lambda F_0$, and which by Eq. 17 on page 57 is none other than
$-S$. Thus, $\sum w V^2$ is computed in a routine manner without first
finding the individual residuals $V_1$, $V_2$, and $V_3$.

The variance coefficient of $G^1$, or the reciprocal of its weight,
appears in the $C^1$ column of Row II$_1$; and the variance coefficient
of $G^2$, or the reciprocal of its weight, appears in the $C^2$ column of
Row II$_2$.

Before adjustment, the weight of $G^1$ was $w_1$, the weight of the
observation $X_1$; after adjustment, the reciprocal of its weight is
$1/w_1 - 1/(L w_1^2)$. Now of course

\[ \frac{1}{w_1} - \frac{1}{L w_1^2} < \frac{1}{w_1} \]

which means that the weight of $x_1$ is greater than the weight of $X_1$.
That is, the weight after adjustment is greater than the weight before,
which seems reasonable enough; the observations on the other two
angles help to estimate $x_1$, and to increase our confidence in its
value. After the adjustment we feel that we know more about
the triangle than before.

In particular if all three angles have the same weight before
adjustment, then if $w_1 = w_2 = w_3 = 1$, the weight of $x_1$ after
adjustment is $1/\{1/w_1 - 1/(L w_1^2)\} = 1/\{1 - 1/3\} = 1.5$, which
is 50 percent greater than the weight of $X_1$. The adjustment
therefore increases the weight by 50 percent. The same thing is of
course true for the other angles.

If the weight of an angle has been increased 50 percent, its
standard error has diminished 18 percent, since

\[ w_{\text{before}} : w_{\text{after}} = (\mathrm{S.E.}_{\text{after}} : \mathrm{S.E.}_{\text{before}})^2 \qquad \text{(See Eq. 16, p. 22.)} \]

Next consider the weight of $x_1 + x_2 + x_3$. From Row II$_2$ in
the form above we see that the reciprocal of the weight of this
function is zero; in other words, the weight of $x_1 + x_2 + x_3$ is
infinite; it is therefore known absolutely. The adjustment
forced the sum to be 180°, and it is no surprise to find its weight
after adjustment to be infinite.

This simple example gives a glimpse of the method for the solution
of problems involving rigorous conditions. A guide for
systematic computation, and a more complicated example, are
given in the next section.
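
The entries in Rows II$_1$ and II$_2$ can be reproduced with a short computation. The sketch below reads the tabular scheme, for the case of a single condition, as the rule [GG/w] - [FG/w]²/[FF/w]; that compact statement of the rule and the function name are assumptions of this example, checked here only against the results quoted in the text.

```python
from fractions import Fraction
from math import sqrt


def reciprocal_weight_after(F, G, w):
    """Variance coefficient of a function G after adjustment.

    F and G hold the derivatives F_i and G_i; w holds the weights.
    Assumes one condition only, so that L = [FF/w].
    """
    FF = sum(Fraction(f * f) / wi for f, wi in zip(F, w))
    FG = sum(Fraction(f * g) / wi for f, g, wi in zip(F, G, w))
    GG = sum(Fraction(g * g) / wi for g, wi in zip(G, w))
    return GG - FG * FG / FF


F = [1, 1, 1]          # derivatives of x1 + x2 + x3 - 180
w = [1, 1, 1]          # equal weights, as in the text

inv_w_G1 = reciprocal_weight_after(F, [1, 0, 0], w)   # G1 = x1
inv_w_G2 = reciprocal_weight_after(F, [1, 1, 1], w)   # G2 = x1 + x2 + x3

weight_x1 = 1 / inv_w_G1                   # weight of x1 after adjustment
se_ratio = sqrt(float(inv_w_G1))           # S.E. after / S.E. before (w1 = 1)
```

The function returns the reciprocal of the weight rather than the weight itself, so that the infinite weight of the sum appears as a harmless zero.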


Exercises 

Exercise 1. Take the values of $V_1$, $V_2$, and $V_3$ found earlier,
namely $\lambda/w_1$, $\lambda/w_2$, and $\lambda/w_3$, and show by direct substitution that
$\lambda F_0$, the negative of the extreme left entry in Row II of the
tabulation shown above, is actually S, or $w_1 V_1^2 + w_2 V_2^2 + w_3 V_3^2$.





Exercise 2. By the use of Eq. 7, page 40, show that the variance
of the sum $x_1 + x_2 + x_3$ of the adjusted angles of a plane triangle
is 0; hence its weight is infinite.

Hint: $x_3 = 180^\circ - x_1 - x_2$

whence

\[
\begin{aligned}
\operatorname{Var}(x_1 + x_2 + x_3) &= \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + 2 r_{12}\sigma_1\sigma_2 + 2 r_{13}\sigma_1\sigma_3 + 2 r_{23}\sigma_2\sigma_3 &\quad &(a) \\
\sigma_1^2 &= \sigma_2^2 + \sigma_3^2 + 2 r_{23}\sigma_2\sigma_3 & &(b) \\
\sigma_2^2 &= \sigma_1^2 + \sigma_3^2 + 2 r_{13}\sigma_1\sigma_3 & &(c) \\
\sigma_3^2 &= \sigma_1^2 + \sigma_2^2 + 2 r_{12}\sigma_1\sigma_2 & &(d)
\end{aligned}
\]

By combining the last three equations with Eq. a it is found
that

\[ \operatorname{Var}(x_1 + x_2 + x_3) = 0 \qquad (e) \]


Exercise 3. Observations $X_1, X_2, \ldots, X_n$, with weights $w_1,
w_2, \ldots, w_n$ are taken on n quantities, the adjusted values of which
are connected by the one condition

\[ x_1 + x_2 + \cdots + x_n = C \]

By making use of the scheme outlined in Section 32 for finding the
weight of a function after adjustment, show that the weight $U_r$ of
the sum $x_1 + x_2 + \cdots + x_r$, r < n, after adjustment, is given by

\[ \frac{1}{U_r} = \frac{1}{W_r} \left( 1 - \frac{W}{W_r} \right) \]

where

\[ \frac{1}{W_r} = \frac{1}{w_1} + \frac{1}{w_2} + \cdots + \frac{1}{w_r} \]

and

\[ \frac{1}{W} = \frac{1}{w_1} + \frac{1}{w_2} + \cdots + \frac{1}{w_n} \]

In particular, if $w_1 = w_2 = \cdots = w_n = 1$,

\[ U_r = \frac{n}{r(n - r)} \]


If n = 3 and r = 1, $U_r$ = 3/2, which is the special case of one
angle of a triangle, already worked out on page 67. If n = 4, as
for a quadrilateral, one angle, after adjustment, has the weight
4/3, the sum of any two angles has the weight 1, and the sum of
three angles has the weight 4/3.
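
The special cases quoted above are easy to confirm numerically. The sketch below is an illustrative check, not a solution of the exercise: it evaluates the reciprocal of the weight as 1/W_r minus (1/W_r)² divided by L, with L the sum of all the reciprocal weights, which is the rule of Section 32 specialized to a partial sum; the function name is hypothetical.

```python
from fractions import Fraction


def weight_of_partial_sum(w, r):
    """Weight after adjustment of x_1 + ... + x_r, for r < len(w).

    Assumes the single condition x_1 + ... + x_n = C.
    """
    L = sum(Fraction(1) / wi for wi in w)            # 1/W
    inv_Wr = sum(Fraction(1) / wi for wi in w[:r])   # 1/W_r
    inv_U = inv_Wr - inv_Wr ** 2 / L                 # reciprocal of U_r
    return 1 / inv_U


triangle = weight_of_partial_sum([1, 1, 1], 1)
quadrilateral = [weight_of_partial_sum([1, 1, 1, 1], r) for r in (1, 2, 3)]
```

For equal unit weights the function reduces to n/(r(n - r)), so the triangle gives 3/2 and the quadrilateral gives 4/3, 1, and 4/3.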


This problem has application also in the social sciences, where
proportions are observed by sampling methods, and the total
count is known from other sources (Ch. VII). For a cell that
is not too small (i.e., for one having a sample frequency of possibly
10 or higher), the weight of the observed frequency may be
assumed inversely proportional to that frequency, and the
variance thereof equal to the cell frequency.

Suppose that $n_1$ and $n_2$ are the observed sample frequencies in
a two-celled table, the total count of the two cells being known.
If $n_1 + n_2 = n$, then $n_1/n$ and $n_2/n$ are the observed proportions.
Denote them by p and q. Then p + q = 1, $L = 1/w_1 + 1/w_2 =
n_1 + n_2 = n$, and the weight $U_1$ of the cell $n_1$ after
adjustment is given by the equation

\[
\frac{1}{U_1} = \text{variance of } n_1 = \frac{1}{w_1} - \frac{1}{n w_1^2}
= n_1 - \frac{n_1^2}{n} = n_1 q = npq
\]

Thus the variance of $n_1$ is reduced from $n_1$ to npq by the adjustment.
The variance of the proportion p is reduced from p/n to
pq/n. The ratio of the variance after adjustment to the variance
before adjustment is thus equal to q. The reduction in
variance is considerable when q is small, i.e., when p is nearly
unity, as happens when $n_1$ is nearly all of n.
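
A quick numerical check of the two-celled case, under the stated assumption that each observed frequency has variance equal to itself (weight 1/n_i). The cell counts 30 and 70 and the function name are invented for the illustration.

```python
from fractions import Fraction


def adjusted_cell_variance(n1, n2):
    """Variance of n1 after adjustment to the known total n1 + n2."""
    w1 = Fraction(1, n1)          # weight of the observed frequency n1
    L = n1 + n2                   # 1/w1 + 1/w2 = n
    return 1 / w1 - 1 / (L * w1 ** 2)


n1, n2 = 30, 70
n = n1 + n2
p, q = Fraction(n1, n), Fraction(n2, n)

var_before = n1                               # variance before adjustment
var_after = adjusted_cell_variance(n1, n2)    # should equal n*p*q
ratio = var_after / var_before                # should equal q
```

With these counts the variance falls from 30 to 21, and the ratio 21/30 is q = 0.7.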



CHAPTER VI 


SYSTEMATIC COMPUTATION FOR GEOMETRIC 
CONDITIONS 


33. Steps in the formation of the normal equations. There 
will be observations, weights, and conditions imposed on the 
adjusted values. 

Observations: $X_1, X_2, \ldots, X_n$
Weights: $w_1, w_2, \ldots, w_n$

\[
\text{Conditions:} \quad
\left.
\begin{aligned}
F^1(x_1, x_2, \ldots, x_n) &= 0 \\
F^2(x_1, x_2, \ldots, x_n) &= 0 \\
F^3(x_1, x_2, \ldots, x_n) &= 0 \\
F^4(x_1, x_2, \ldots, x_n) &= 0
\end{aligned}
\right\}
\quad \text{(These are Eqs. 3, p. 50, except that here there are no parameters.)}
\tag{1}
\]


1st step. Write down the conditions, i.e., select the appropriate
F functions. Decide also on the G functions whose weights are
wanted. One then works out the values of $F_0$, which will usually
turn out to be small numbers, since the conditions will be nearly
but not quite satisfied by the observations; see for instance
page 77.

By the use of the reciprocal matrix as explained in Section 36, 
one need not decide on all his G functions at the start; more can be 
added later without great inconvenience. 


The solution will be illustrated with four conditions; i.e., the 
number $\nu$ in Eqs. 3 on page 50 is taken as 4, which will be the
number of Lagrange multipliers ($\lambda$). Expansion or contraction to
more or fewer conditions is easy. (In the simple triangle problem
of Sec. 31, p. 60, there was only one condition, and one $\lambda$.)

We shall assume here that we want to find the weights of two
functions of the adjusted values. Let these functions be designated as

$$G^1(x_1, x_2, \ldots, x_n)$$

$$G^2(x_1, x_2, \ldots, x_n)$$

(In the triangle problem of Sec. 32, $G^1$ was $x_1$, and $G^2$ was
$x_1 + x_2 + x_3$; see p. 65.)

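2d step. This requires some differential calculus. It consists
of writing down the various derivatives that are needed, such as

$$F_1^1 \text{ or } \frac{\partial F^1}{\partial x_1}, \qquad F_2^1 \text{ or } \frac{\partial F^1}{\partial x_2}, \qquad F_3^2 \text{ or } \frac{\partial F^2}{\partial x_3}, \qquad F_2^3 \text{ or } \frac{\partial F^3}{\partial x_2}, \qquad \text{etc.}$$

These are used for forming the L coefficients, according to Eq. 14
on page 55. We shall also need the derivatives of the G functions,
such as

$$G_1^1 \text{ or } \frac{\partial G^1}{\partial x_1}, \qquad G_2^1 \text{ or } \frac{\partial G^1}{\partial x_2}, \qquad G_3^2 \text{ or } \frac{\partial G^2}{\partial x_3}, \qquad G_2^2 \text{ or } \frac{\partial G^2}{\partial x_2}, \qquad \text{etc.}$$

which are to be used in computing the weights of the G functions.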

3d step. Work out the numerical values of the derivatives; see, 
for instance, page 78. In each case, the observed values $X_1$,
$X_2, \ldots, X_n$ are used in place of the adjusted quantities
$x_1, x_2, \ldots, x_n$, since approximate values of the derivatives are
usually close enough; at least they will have to suffice till we can get better ones.
The following table is made up, numerical values being inserted in 
the spaces. Naturally, more or fewer columns will be needed in 
various problems, and different computers will work differently 
even on the same problem. The layout will also vary, depending 
on what type of calculating machine is available. Only general 
directions can be given in advance of a specific problem. 



72 


CONDITIONS WITHOUT PARAMETERS [Sec. 33] 


TABLE 1 (3d step)

  i     w_i     F_i^1    F_i^2    F_i^3    F_i^4    G_i^1    G_i^2    Sum
  1
  2
  ...
  n

The sums at the right in Table 1 are formed exclusive of the entries for the 
weights. They are useful in checking the formation of Table 2. 


4th step. Form Table 2, which is derived from Table 1, by
multiplying the F and G derivatives by the corresponding values of
$1/\sqrt{w_i}$, as indicated in the headings of Table 2.


TABLE 2

The matrix for the formation of the normal equations (4th step)

  i     F_i^1/√w_i   F_i^2/√w_i   F_i^3/√w_i   F_i^4/√w_i   G_i^1/√w_i   G_i^2/√w_i   Sum
  1
  2
  ...
  n
  Sum                                                                                  ✓


Table 2 is termed a matrix because from it are formed the normal equations.
Moreover, in matrix notation, the formation of the normal equations is the
product M'M, M being the matrix of Table 2, and M' its “transpose.”

The sums shown at the right and across the bottom of Table 2 are used for
checking the formation of the normal equations. The sums themselves are
checked by adding them down and across, to see that they add to the same
grand total either way (the “corner check”).

There are various procedures that one can follow in computing Table 2
from Table 1. With automatic multiplication, the computer may prefer to
use $1/\sqrt{w_i}$ as a constant factor in row i, reading off the individual products
($F_i^1/\sqrt{w_i}$, etc.) and entering them in Table 2, cumulating the sum
$F_i^1 + F_i^2 + F_i^3 + F_i^4 + G_i^1 + G_i^2$ of the multipliers to check with the sums already
entered in Table 1.

With a machine having two multiplier registers, one for cumulating 
the quotients, and the other for reading individual quotients, the computer 
may cumulate a sum in either the horizontal or vertical without extra effort. 
If the machine moreover permits the dividend to be altered independently of 
the keyboard, one may set $\sqrt{w_i}$ on the keyboard and use it for a divisor
throughout an entire horizontal row of Table 1, entering the individual
quotients in Table 2, and at the same time cumulating the sum to be entered
at the right. 

The use of punch card equipment for forming normal equations may save 
time and expense on large projects. 

5th step. The coefficients in the normal equations are now to be 
formed from Table 2. By recalling the definition of $L_{hk}$ in Eq. 14
on page 55, and by introducing the $C^1$ and $C^2$ columns for the
weights of the G functions as used in Chapter V, we may rewrite
the normal equations of page 59 in the form shown below.
It will be observed that the terms on the diagonal are sums of
squares formed from the columns of Table 2, and that the terms off
the diagonal are the sums of cross-products formed from these
columns. The numbers entered in the $C^1$ and $C^2$ columns are
likewise the sums of squares and cross-products.


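Normal equations

                          Unknowns
Row   λ1           λ2           λ3           λ4           = 1      C^1          C^2          Sum
I     [F^1F^1/w]   [F^1F^2/w]   [F^1F^3/w]   [F^1F^4/w]   F_0^1    [F^1G^1/w]   [F^1G^2/w]
2                  [F^2F^2/w]   [F^2F^3/w]   [F^2F^4/w]   F_0^2    [F^2G^1/w]   [F^2G^2/w]
3                               [F^3F^3/w]   [F^3F^4/w]   F_0^3    [F^3G^1/w]   [F^3G^2/w]
4                                            [F^4F^4/w]   F_0^4    [F^4G^1/w]   [F^4G^2/w]
5                                                         0        [G^1G^1/w]   [G^2G^2/w]
                                                                                             (2)

(Entries below the diagonal follow by symmetry.)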


The sum column at the right checks the formation of the normal 
equations. Herein are entered (in pencil) the cumulation of the 
cross-multiplications formed with the sum column of Table 2; these 
should agree with the sums of the terms in the normal equations, the 
“ 1 ” column excluded; see Table 3 in Section 34, and the check 
formed immediately below. If no errors are found, the sums entered 
in pencil at the right of the normal equations are altered to include 
the “ 1 ” column, and the solution proceeds, being checked at the 
pivotal points (see the check marks in Rows II, III, and IV of 
the numerical solution in Sec. 34). The sums $[G^1G^1/w]$ and
$[G^2G^2/w]$ must be checked otherwise, as by repetition.

The 0 in the bottom row of the normal equations is appended for
the computation of the minimized sum of squares, S. The columns
$C^1$ and $C^2$ assist in the computation of the weights of the functions
$G^1$ and $G^2$.

The solution of the equations is to be carried out by the routine 
process already seen in simplified form on page 66, and to be illus¬ 
trated more fully on pages 82-83, and symbolically on page 158. 
When the numerical values of the Lagrange multipliers ($\lambda$) have
been worked out, the residuals $V_1, \ldots, V_n$ are to be calculated by
Eq. 12 on page 54, and then used to find the “adjusted observations”
$x_1, x_2, \ldots, x_n$ as follows:

$$\begin{aligned}
x_1 &= X_1 - V_1 = X_1 - \frac{1}{w_1}\left(\lambda_1 F_1^1 + \lambda_2 F_1^2 + \lambda_3 F_1^3 + \lambda_4 F_1^4\right) \\
x_2 &= X_2 - V_2 = X_2 - \frac{1}{w_2}\left(\lambda_1 F_2^1 + \lambda_2 F_2^2 + \lambda_3 F_2^3 + \lambda_4 F_2^4\right) \\
&\;\;\vdots \\
x_n &= X_n - V_n = X_n - \frac{1}{w_n}\left(\lambda_1 F_n^1 + \lambda_2 F_n^2 + \lambda_3 F_n^3 + \lambda_4 F_n^4\right)
\end{aligned} \tag{3}$$

It should be noted that the numerical values of the derivatives
$F_1^1$, $F_1^2$, etc., required in the parentheses, are ready for use in
Table 1, p. 72.

34. Numerical example: a surveying problem. A surveying 
party measures the sides and angles of the plane triangle PQR,
with the following results:

On angle P (4 observations):   51° 06', 08', 05', 06'    Average 51° 06'.25
On angle Q (2 observations):   95° 05', 04'              Average 95° 04'.5
On angle R (2 observations):   33° 49', 50'              Average 33° 49'.5

Side p: 1723.7 ft.
Side q: 2205.4 ft.
Side r: 1232.7 ft.

Fig. 12. The sides and angles of this plane triangle have been
measured. The sum of the adjusted angles must be 180°; and the
adjusted angles and sides must satisfy the sine law.


The transit man, from previous experience, has reason to believe 
that the standard error of single measurements on one angle is 
about one minute of arc, or 0.00029 radian. He takes the standard 
error of the chainmen to be one foot in 10,000 feet, and in propor¬ 
tion to the square root of the distance chained. The weights of the 
observations on the angles and sides are then in ratios as follows: 


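$$w_P : w_Q : w_R : w_p : w_q : w_r = \frac{4}{0.00029^2} : \frac{2}{0.00029^2} : \frac{2}{0.00029^2} : \frac{10^4}{1723.7} : \frac{10^4}{2205.4} : \frac{10^4}{1232.7} \tag{4}$$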


These ratios come from Eq. 13 on page 21, wherein the weight
of a function $f$ was defined to be inversely proportional to its
variance. Since weights are relative and not absolute, the factor
of proportionality ($\sigma^2$) in Eq. 13 on page 21 is arbitrary and can
be chosen for convenience; accordingly we let

$$\sigma^2 = \tfrac{1}{2} \times 0.00029^2 = 4.23 \times 10^{-8} \tag{5}$$

whereupon the weights take these simple values:

$$w_P = 2, \quad w_Q = 1, \quad w_R = 1,$$

$$w_p = 24.6 \times 10^{-8}, \quad w_q = 19.2 \times 10^{-8}, \quad w_r = 34.3 \times 10^{-8}$$

It should be noted that the final adjusted values of the sides and
angles, also their standard errors, are in no way dependent on the
arbitrary choice made for $\sigma^2$; if $\sigma^2$ is doubled, all the weights are
also doubled, and the standard errors of all functions are left
unaltered. Likewise $\chi^2$ is unaltered.
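The weights just quoted follow from the stated standard errors, and a short computation reproduces them. This is a sketch under the assumptions stated in the comments (one minute of arc per single angle reading, one foot in 10,000 feet growing with the square root of the distance); the helper names are illustrative and not from the text.

```python
import math

ARC_MINUTE = math.pi / (180 * 60)          # one minute of arc in radians, about 0.00029


def angle_weight(n_obs: int, sigma2: float) -> float:
    """Weight of the mean of n_obs angle readings, each with s.e. of one minute."""
    variance_of_mean = ARC_MINUTE ** 2 / n_obs
    return sigma2 / variance_of_mean


def side_weight(length_ft: float, sigma2: float) -> float:
    """Weight of a chained side: s.e. 1 ft at 10,000 ft, proportional to sqrt(distance)."""
    variance = 1.0 ** 2 * length_ft / 10_000.0
    return sigma2 / variance


sigma2 = 0.5 * ARC_MINUTE ** 2             # the convenient choice of Eq. 5

weights = {
    "P": angle_weight(4, sigma2),
    "Q": angle_weight(2, sigma2),
    "R": angle_weight(2, sigma2),
    "p": side_weight(1723.7, sigma2),
    "q": side_weight(2205.4, sigma2),
    "r": side_weight(1232.7, sigma2),
}
for name, w in weights.items():
    print(name, w)
```

Doubling `sigma2` doubles every weight, which is why the adjusted values and their standard errors do not depend on the choice.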

The solution of the problem proceeds now according to the steps 
outlined at the beginning of this chapter (Sec. 33). 

1st step. The adjustment must be carried out to enforce the
following three geometrical conditions:

$$\frac{\sin P}{p} = \frac{\sin Q}{q} = \frac{\sin R}{r}$$

$$P + Q + R = 180° + \epsilon \tag{8}$$

$\epsilon$ being the spherical excess, which, owing to the small size of the
triangle, will here be taken as zero. If it were other than zero, $F_0^3$
(Eq. 10) would be altered by the amount $\epsilon$, and the adjusted values
of the sides and angles and their standard errors would all be
affected in an obvious manner.

For forcing the three conditions, let us set

$$\begin{aligned}
F^1(P, Q, R, p, q, r) &= \frac{\sin P}{p} - \frac{\sin Q}{q} \\
F^2(P, Q, R, p, q, r) &= \frac{\sin P}{p} - \frac{\sin R}{r} \\
F^3(P, Q, R, p, q, r) &= P + Q + R - 180°
\end{aligned}$$

(The number of conditions is 3; i.e., the number $\nu$ occurring in
Eqs. 3 on page 50 is 3.)

$F^1$, $F^2$, $F^3$ when evaluated do not give zeros, but give the small
numbers $F_0^1$, $F_0^2$, $F_0^3$, which by direct substitution are found to be

$$\begin{aligned}
F_0^1 &= \frac{\sin 51° 06'.25}{1723.7} - \frac{\sin 95° 04'.5}{2205.4} = -1.3271 \times 10^{-7} \\
F_0^2 &= \frac{\sin 51° 06'.25}{1723.7} - \frac{\sin 33° 49'.5}{1232.7} = -0.5416 \times 10^{-7} \\
F_0^3 &= 51° 06'.25 + 95° 04'.5 + 33° 49'.5 - 180° = 0° 0'.25 = 7.27 \times 10^{-5} \text{ radian}
\end{aligned} \tag{10}$$

If it had happened that the observations satisfied the conditions
exactly, then $F_0^1$, $F_0^2$, and $F_0^3$ would have turned out to be zeros,
and the adjusted values would have been identical with those
observed. As it is, the observations satisfy the conditions nearly
but not exactly, i.e., $F_0^1$, $F_0^2$, and $F_0^3$ are small but not zeros.

$F_0^3$ is the amount by which the sum of the angles exceeds 180°.
In the simpler problem wherein the sides were not measured (vide 
supra, Sec. 31) it turned out that the least squares adjustment 
was simply an apportionment of this discrepancy among the three 
angles in inverse proportion to their weights. Now, however, the 
sides are involved; wherefore the adjustment, though possibly as 
reasonable as before, will not be so easy to arrive at. By looking 
ahead to page 84 we see that, in contrast with the residuals on 
page 61, the adjustments on the angles will not now be all in the 
same direction. 
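The three small numbers found by direct substitution can be reproduced as follows. This is a minimal sketch; the helper `dm` (degrees and decimal minutes to radians) and the variable names are illustrative assumptions, while the observed values are those of the survey above.

```python
import math


def dm(degrees: float, minutes: float) -> float:
    """Convert degrees and decimal minutes to radians."""
    return math.radians(degrees + minutes / 60.0)


# Observed averages of the angles and the measured sides (feet).
P, Q, R = dm(51, 6.25), dm(95, 4.5), dm(33, 49.5)
p, q, r = 1723.7, 2205.4, 1232.7


def conditions(P, Q, R, p, q, r):
    """The three condition functions: two from the sine law, one from the angle sum."""
    F1 = math.sin(P) / p - math.sin(Q) / q
    F2 = math.sin(P) / p - math.sin(R) / r
    F3 = P + Q + R - math.pi               # 180 degrees, spherical excess taken as zero
    return F1, F2, F3


F0 = conditions(P, Q, R, p, q, r)
print(F0)
```

The third value is simply a quarter of a minute of arc expressed in radians; the first two are small differences of nearly equal quotients, so they should be computed with ample figures.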

Now suppose that for some reason or other we should like to 
know the weights of 

Angle P

The sum P + Q + R

The area of the triangle, which may be expressed as $\tfrac{1}{2} pr \sin Q$

Any number of others could be added (at increased labor) but three
will suffice here. For those just named we take the three G
functions

$$G^1 = P, \qquad G^2 = P + Q + R, \qquad G^3 = \tfrac{1}{2} pr \sin Q \tag{11}$$




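2d step. The derivatives of the F functions are

$$\begin{aligned}
F_P^1 &= \frac{\cos P}{p} & F_P^2 &= \frac{\cos P}{p} & F_P^3 &= 1 \\
F_Q^1 &= -\frac{\cos Q}{q} & F_Q^2 &= 0 & F_Q^3 &= 1 \\
F_R^1 &= 0 & F_R^2 &= -\frac{\cos R}{r} & F_R^3 &= 1 \\
F_p^1 &= -\frac{\sin P}{p^2} & F_p^2 &= -\frac{\sin P}{p^2} & F_p^3 &= 0 \\
F_q^1 &= \frac{\sin Q}{q^2} & F_q^2 &= 0 & F_q^3 &= 0 \\
F_r^1 &= 0 & F_r^2 &= \frac{\sin R}{r^2} & F_r^3 &= 0
\end{aligned} \tag{12}$$

The derivatives of the G functions are

$$\begin{aligned}
G_P^1 &= 1 & G_P^2 &= 1 & G_P^3 &= 0 \\
&\text{(the other five} & G_Q^2 &= 1 & G_Q^3 &= \tfrac{1}{2} pr \cos Q \\
&\text{derivatives are zero)} & G_R^2 &= 1 & G_R^3 &= 0 \\
& & &\text{(the other three} & G_p^3 &= \tfrac{1}{2} r \sin Q \\
& & &\text{derivatives are zero)} & G_q^3 &= 0 \\
& & & & G_r^3 &= \tfrac{1}{2} p \sin Q
\end{aligned} \tag{13}$$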


3d step. The nearest numerical approximations that we can 
produce for these derivatives are found by substituting the 
observed angles and sides into the expressions just worked out, 
and these approximations will be more than close enough. 


TABLE 1

The derivatives (3d step)

  i    w_i           √w_i          10^6 F_i^1   10^6 F_i^2   F_i^3   G_i^1   G_i^2   G_i^3
  P    2             1.41          364          364          1       1       1       0
  Q    1             1             40.1         0            1       0       1       -93916
  R    1             1             0            -674         1       0       1       0
  p    24.6·10^-8    4.96·10^-4    -0.262       -0.262       0       0       0       613.9
  q    19.2·10^-8    4.38·10^-4    0.205        0            0       0       0       0
  r    34.3·10^-8    5.86·10^-4    0            0.366        0       0       0       858.5
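The entries of Table 1 may be verified by substituting the observed values into the derivative expressions of the 2d step. The sketch below is illustrative only; the dictionary layout and names are assumptions, not the book's notation.

```python
import math


def dm(degrees: float, minutes: float) -> float:
    """Convert degrees and decimal minutes to radians."""
    return math.radians(degrees + minutes / 60.0)


P, Q, R = dm(51, 6.25), dm(95, 4.5), dm(33, 49.5)
p, q, r = 1723.7, 2205.4, 1232.7

# Derivatives of F1, F2, F3 with respect to each observation (rows of Table 1).
F = {
    "P": (math.cos(P) / p, math.cos(P) / p, 1.0),
    "Q": (-math.cos(Q) / q, 0.0, 1.0),
    "R": (0.0, -math.cos(R) / r, 1.0),
    "p": (-math.sin(P) / p ** 2, -math.sin(P) / p ** 2, 0.0),
    "q": (math.sin(Q) / q ** 2, 0.0, 0.0),
    "r": (0.0, math.sin(R) / r ** 2, 0.0),
}

# Nonzero derivatives of G3 = (1/2) p r sin Q.
G3 = {
    "Q": 0.5 * p * r * math.cos(Q),
    "p": 0.5 * r * math.sin(Q),
    "r": 0.5 * p * math.sin(Q),
}

for name, (f1, f2, f3) in F.items():
    # Scale the first two columns by 10^6, as in the table headings.
    print(name, round(f1 * 1e6, 3), round(f2 * 1e6, 3), f3, round(G3.get(name, 0.0), 1))
```

The derivative of the area with respect to Q is a product of two large sides and a small cosine, so its last figures depend on how many decimals of cos Q are carried; agreement with the tabulated value to three figures is all that should be expected.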




4th step. $\sqrt{w}$ is now used as a divisor to form Table 2 from
Table 1.

TABLE 2

The matrix for the formation of the normal equations (4th step)

  i     10^3 F_i^1/√w_i   10^3 F_i^2/√w_i   F_i^3/√w_i   G_i^1/√w_i   G_i^2/√w_i   10^-6 G_i^3/√w_i   Sum S_i
  P     0.257             0.257             0.707        0.707        0.707        0                  2.635
  Q     0.040             0                 1            0            1            -0.094             1.946
  R     0                 -0.674            1            0            1            0                  1.326
  p     -0.528            -0.528            0            0            0            1.238              0.182
  q     0.468             0                 0            0            0            0                  0.468
  r     0                 0.625             0            0            0            1.465              2.090
  Sum   0.237             -0.320            2.707        0.707        2.707        2.609              8.647 ✓


The powers of 10 in Table 2 are chosen with regard to conven¬ 
ience, and to bring the number of decimals to uniformity from 
column to column, to facilitate the cumulation of squares and 
cross-products in forming the normal equations (the next step). 

At this stage one may also cut off superfluous figures, reserving, as 
a rule, not more than three or four in the largest number occurring 
in any one column. This often means that some other entries in 
the same column appear as zeros, but this is as it should be. 

5th step. The cumulations of squares and cross-products from
the columns of Table 2 provide the coefficients required for the
normal equations (Eqs. 2, p. 73). For instance,¹

$$10^6 \left[\frac{F^1 F^1}{w}\right] = 0.257^2 + 0.040^2 + 0^2 + 0.528^2 + 0.468^2 + 0^2 = 0.565 \tag{14}$$

as seen under $\lambda_1$ in the normal equations. Also

$$10^3 \left[\frac{F^2 F^3}{w}\right] = 0.257 \times 0.707 + 0 - 0.674 + 0 + 0 + 0 = -0.492 \tag{15}$$

as seen under $\lambda_3$. The student should verify the whole set
appearing in Table 3.

¹ The subscript i will be omitted for convenience occasionally.

TABLE 3

The cumulation of squares and cross-products from Table 2 for the
formation of the normal equations

  10^6 [F^1F^1/w] = 0.565,    10^6 [F^1F^2/w] = 0.345,    10^3 [F^1F^3/w] = 0.222
                              10^6 [F^2F^2/w] = 1.190,    10^3 [F^2F^3/w] = -0.492
                                                          [F^3F^3/w] = 2.500

  10^3 [F^1G^1/w] = 0.182,    10^3 [F^1G^2/w] = 0.222,    10^-3 [F^1G^3/w] = -0.657
  10^3 [F^2G^1/w] = 0.182,    10^3 [F^2G^2/w] = -0.492,   10^-3 [F^2G^3/w] = 0.262
  [F^3G^1/w] = 0.500,         [F^3G^2/w] = 2.500,         10^-6 [F^3G^3/w] = -0.094

  [G^1G^1/w] = 0.500,         [G^2G^2/w] = 2.500,         10^-12 [G^3G^3/w] = 3.687

  [F^1 S/w] = 0.878*
  [F^2 S/w] = 0.994*
  [F^3 S/w] = 5.135*


* Check (Powers of 10 are disregarded in the sum checks): 

0.565 + 0.345 + 0.222 + 0.182 + 0.222 - 0.657 = 0.879 

0.345 + 1.190 - 0.492 + 0.182 - 0.492 + 0.262 = 0.995 

0.222 - 0.492 + 2.500 + 0.500 + 2.500 - 0.094 = 5.136 

The sums formed below the table to provide a check do not
agree exactly with the numbers starred in the table, which are
formed with the sums $S_i$ of Table 2, but the agreement is within
errors of rounding off, whereupon we conclude that the arithmetic
in Table 3 is correct, save for the three [GG/w] sums, which must
be checked independently, as by repetition in reverse order. The
cumulations shown in Table 3 are then entered into Rows I, 2, 3,
4 of the tabular scheme for the normal equations on the two
following pages. The numbers entered in the “1” column of Rows
I, 2, and 3 come from the values of $F_0^1$, $F_0^2$, and $F_0^3$ on page 77
after multiplication by appropriate powers of 10 to produce
decimals of the same denomination as the other parts of the normal
equations. (The factor $10^{-6}$ applies to the whole of the “1”
column.)
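The cumulations of Table 3 and the sum check can be reproduced from the rounded entries of Table 2. What follows is a sketch in which the names `TABLE2`, `bracket`, and `sum_check` are illustrative assumptions; powers of 10 are disregarded, as in the check under Table 3.

```python
COLUMNS = ["F1", "F2", "F3", "G1", "G2", "G3"]

# Rows of Table 2 (derivatives already divided by sqrt(w) and scaled by powers of 10).
TABLE2 = {
    "P": [0.257, 0.257, 0.707, 0.707, 0.707, 0.0],
    "Q": [0.040, 0.0, 1.0, 0.0, 1.0, -0.094],
    "R": [0.0, -0.674, 1.0, 0.0, 1.0, 0.0],
    "p": [-0.528, -0.528, 0.0, 0.0, 0.0, 1.238],
    "q": [0.468, 0.0, 0.0, 0.0, 0.0, 0.0],
    "r": [0.0, 0.625, 0.0, 0.0, 0.0, 1.465],
}


def column(name):
    """One column of Table 2 as a list, in row order P, Q, R, p, q, r."""
    k = COLUMNS.index(name)
    return [row[k] for row in TABLE2.values()]


def bracket(a, b):
    """Cumulated cross-product of two columns (a sum of squares when a is b)."""
    return sum(x * y for x, y in zip(a, b))


row_sums = [sum(row) for row in TABLE2.values()]      # the S_i column

table3 = {
    (a, b): bracket(column(a), column(b))
    for i, a in enumerate(COLUMNS)
    for b in COLUMNS[i:]
}


def sum_check(name):
    """Return (sum of the cross-products in one row, cross-product with the S_i column)."""
    direct = sum(bracket(column(name), column(other)) for other in COLUMNS)
    via_sums = bracket(column(name), row_sums)
    return direct, via_sums


for key, value in table3.items():
    print(key, round(value, 3))
print(sum_check("F1"), sum_check("F2"), sum_check("F3"))
```

Carried at full precision the two members of each check agree exactly; the small disagreements noted in the text arise only because the book's entries were rounded to three decimals before being added.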

The sums at the right of the normal equations are not the num¬ 
bers 0.879, 0.995, and 5.136 previously seen in the check under 
Table 3 but are these numbers to which have been added the corre¬ 
sponding entries of the “ 1 ” column; the normal equations thus 
start off with a sum column that provides checks at the pivotal 
points of the solution (note the check marks in Rows II, III, and 
IV). 

The solution proceeds according to the directions under “ How 
obtained.” The same system of solution has been seen in simple 
problems on pages 20,33, and 66, and will be seen again on page 158 
and in Chapter XI. 

35. Conclusions from the solution of the normal equations. 

1°: From Row IV,

$$S \text{ or } \sum wV^2 = 0.042 \cdot 10^{-6}$$

It follows from Eq. 21 on page 28 that

$$\sigma^2(\text{ext}) = \frac{0.042 \cdot 10^{-6}}{6 - 3} = 1.4 \cdot 10^{-8}$$

Since this is only about one-third the prior $\sigma^2$ arbitrarily chosen on
page 76, we conclude that so far there is no indication of blunders in
the observations or recording.

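2°: Rows 13, 12, and 11 in the solution on pages 82 and 83 give

$$\lambda_1 = -0.308, \qquad \lambda_2 = 0.073, \qquad \lambda_3 = 0.071 \cdot 10^{-3}$$

These used in Eq. 12, page 54, give

$$V_P = \tfrac{1}{2}\left(\lambda_1 F_P^1 + \lambda_2 F_P^2 + \lambda_3 F_P^3\right) = -0.0000075 \text{ radian} = -0.03 \text{ min.}$$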



Combined solution of the normal equations, the computation of S, and the weights of three functions

Row   Factors                      10^-6 λ1  10^-6 λ2  10^-3 λ3  = 1            C^1           C^2           C^3           Sum       How obtained
I                                  0.565     0.345     0.222     -0.133×10^-6   0.182×10^-3   0.222×10^-3   -0.657×10^3   0.746 ✓
2                                            1.190     -0.492    -0.054         0.182         -0.492        0.262         0.941 ✓
3                                                      2.500     0.073          0.500         2.500         -0.094        5.209 ✓
4                                                                0              0.500         2.500         3.687         6.573

5     0.345/0.565 = 0.6106                   -0.211    -0.136    0.081          -0.111        -0.136        0.401         -0.455    I(-0.6106)
II                                           0.979     -0.628    0.027          0.071         -0.628        0.663         0.486 ✓   (2) + (5)

6     0.222/0.565 = 0.3929                             -0.087    0.052          -0.072        -0.087        0.258         -0.293    I(-0.3929)
7     0.628/0.979 = 0.6415                             -0.403    0.017          0.046         -0.403        0.425         0.311     II(+0.6415)
III                                                    2.010     0.142          0.474         2.010         0.589         5.227 ✓   (3) + (6) + (7)

8     0.133/0.565 = 0.2354                                       -0.031         0.043         0.052         -0.155        0.176     I(+0.2354)
9     0.027/0.979 = 0.0276                                       -0.001         -0.002        0.017         -0.018        -0.013    II(-0.0276)
10    0.142/2.010 = 0.0706                                       -0.010         -0.033        -0.142        -0.042        -0.369    III(-0.0706)
IV                                                               -0.042         0.508         2.427         3.472         6.367 ✓   (4) + (8) + (9) + (10)

13    10^-6 λ1 = -0.308 × 10^-6                                                                                                     Subst. from (11) & (12) into I
12    10^-6 λ2 =  0.073 × 10^-6                                                                                                     Subst. from (11) into II
11    10^-3 λ3 =  0.071 × 10^-6                                                                                                     III ÷ 2.010

14    0.182/0.565 = 0.3221                                                      -0.059                                              I(-0.3221)
15    0.071/0.979 = 0.0725                                                      -0.005                                              II(-0.0725)
16    0.474/2.010 = 0.2358                                                      -0.112                                              III(-0.2358)
IV^1                                                                            0.324                                               (4) + (14) + (15) + (16)

17    0.222/0.565 = 0.3929                                                                    -0.087                                I(-0.3929)
18    0.628/0.979 = 0.6415                                                                    -0.403                                II(+0.6415)
19    2.010/2.010 = 1                                                                         -2.010                                III(-1)
IV^2                                                                                          0.000                                 (4) + (17) + (18) + (19)

20    0.657/0.565 = 1.1623                                                                                  -0.764                  I(+1.1623)
21    0.663/0.979 = 0.6772                                                                                  -0.449                  II(-0.6772)
22    0.589/2.010 = 0.2930                                                                                  -0.173                  III(-0.2930)
IV^3                                                                                                        2.302                   (4) + (20) + (21) + (22)

(The powers of 10 written at the tops of the “1,” C^1, C^2, and C^3 columns are understood to apply all the way down.)





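$$\begin{aligned}
V_Q &= \lambda_1 F_Q^1 + \lambda_2 F_Q^2 + \lambda_3 F_Q^3 = 0.0000582 \text{ radian} = 0.20 \text{ min.} \\
V_R &= \lambda_1 F_R^1 + \lambda_2 F_R^2 + \lambda_3 F_R^3 = 0.0000214 \text{ radian} = 0.07 \text{ min.} \\
V_p &= \frac{10^8}{24.6}\left(\lambda_1 F_p^1 + \lambda_2 F_p^2 + \lambda_3 F_p^3\right) = 0.25 \text{ ft.} \\
V_q &= \frac{10^8}{19.2}\left(\lambda_1 F_q^1 + \lambda_2 F_q^2 + \lambda_3 F_q^3\right) = -0.33 \text{ ft.} \\
V_r &= \frac{10^8}{34.3}\left(\lambda_1 F_r^1 + \lambda_2 F_r^2 + \lambda_3 F_r^3\right) = 0.08 \text{ ft.}
\end{aligned}$$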

for the six residuals. It is important to note that the numerical 
values of the derivatives required here are already worked out in 
Table 1, page 78. 
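The multipliers, the minimized sum S, and the six residuals can be recomputed from the rounded figures of Tables 1 and 3. The sketch below replaces the book's tabular elimination with ordinary Gaussian elimination with partial pivoting; the function `solve` and the data layout are illustrative assumptions, and small differences in the last figure from the printed values are to be expected from rounding.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            factor = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= factor * M[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - tail) / M[k][k]
    return x


# Coefficients of the normal equations (Table 3) with their powers of 10 restored.
L = [
    [0.565e-6, 0.345e-6, 0.222e-3],
    [0.345e-6, 1.190e-6, -0.492e-3],
    [0.222e-3, -0.492e-3, 2.500],
]
F0 = [-0.133e-6, -0.054e-6, 0.073e-3]      # the "1" column

lam = solve(L, F0)
S = sum(l * f for l, f in zip(lam, F0))    # minimized sum of weighted squares
sigma2_ext = S / 3                         # three conditions

# Weight and (F1, F2, F3) derivatives of each observation, from Table 1.
table1 = {
    "P": (2.0, (364e-6, 364e-6, 1.0)),
    "Q": (1.0, (40.1e-6, 0.0, 1.0)),
    "R": (1.0, (0.0, -674e-6, 1.0)),
    "p": (24.6e-8, (-0.262e-6, -0.262e-6, 0.0)),
    "q": (19.2e-8, (0.205e-6, 0.0, 0.0)),
    "r": (34.3e-8, (0.0, 0.366e-6, 0.0)),
}
residuals = {
    name: sum(l * f for l, f in zip(lam, derivs)) / w
    for name, (w, derivs) in table1.items()
}

ARC_MINUTE = math.pi / (180 * 60)
for name in ("P", "Q", "R"):
    print(name, residuals[name] / ARC_MINUTE, "min.")
for name in ("p", "q", "r"):
    print(name, residuals[name], "ft.")
```

The angle residuals come out in radians and are converted to minutes for printing; the side residuals are in feet.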

3°: By using these residuals with Eq. 6 , page 52, we find that the 
adjusted value of 

Angle P is 51° 06'.25 + 0'.03 = 51° 06'.28 
Angle Q is 95° 04'.5 - 0'.20 = 95° 04'.30 
Angle R is 33° 49'.5 - 0'.07 = 33° 49'.43 
Side p is 1723.7 - 0.25 = 1723.45 ft. 

Side q is 2205.4 + 0.33 = 2205.73 “ 

Side r is 1232.7 - 0.08 = 1232.62 “ 

Remark. Perfect closure (third condition on p. 76) may be
secured by lowering angle R by the trifling amount 0'.01; the value
33° 49'.42 so obtained, along with the other adjusted angles and
sides just written, will satisfy also the first and second conditions on
page 76 to within 1 part in ½ million, which is about all we should ask
for. Whenever, as happened here, one or more of the conditions 
fails owing to cumulated inexactness of rounding off, the computer 
is at liberty to manipulate the terminal figure of one or more of the 
residuals, raising or lowering it a unit or so to force the conditions. 

If not inconvenient, he will ordinarily (as was just done here) 
select the quantities of least weight for any such manipulations. 

The amount involved will be small compared with the standard 
errors of the final results (cf. also p. 229). 



4°: The weights and the standard errors of the three G functions
(p. 77) are found as follows:

From Row IV^1 the weight of the adjusted angle P is
$1/(0.324 \cdot 10^{-6} \cdot 10^{+6}) = 1/0.324$. In other words, 0.324 is the
variance coefficient of angle P. Then with $\sigma^2 = 4.23 \cdot 10^{-8}$
(p. 76), it turns out that the standard error of the adjusted
angle P is $(4.23 \cdot 10^{-8} \cdot 0.324)^{1/2} = 1.2 \cdot 10^{-4}$ radian = 0.40 min. So

Angle P = 51° 06'.3 ± 0'.4

From Row IV^2 the weight of the adjusted sum of P + Q + R is
1/0 or $\infty$, as predicted. Hence the sum of the adjusted angles
would be written

P + Q + R = 180° absolutely

From Row IV^3 the weight of the area $\tfrac{1}{2} pq \sin R$ is $1/(2.302 \times 10^{12})$.
Its standard error is therefore $(4.23 \times 10^{-8} \times 2.30 \times 10^{12})^{1/2}$ =
312 square feet; therefore the adjusted value of

The area is 1058028 ± 312 sq. ft.

The area would better be written (105803 ± 31) × 10 square
feet, since not more than two figures of the standard error could be
assumed known. In acres,

The area = 24.2890 ± 0.0072 acres

The area is found by using the adjusted values of p, q, and R and
taking $\tfrac{1}{2} pq \sin R$. Of course one could as well use $\tfrac{1}{2} qr \sin P$ or
$\tfrac{1}{2} pr \sin Q$ for the area; one is as good as another.
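The three variance coefficients read from Rows IV^1, IV^2, and IV^3 can also be had without the tabular scheme. The sketch below assumes the relation that the reciprocal weight of an adjusted function G equals [GG/w] less the cross-products [F^h G/w] combined with the solution of the normal equations in which the C column replaces the “1” column; the names `solve` and `reciprocal_weight` are illustrative.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            factor = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= factor * M[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - tail) / M[k][k]
    return x


L = [
    [0.565e-6, 0.345e-6, 0.222e-3],
    [0.345e-6, 1.190e-6, -0.492e-3],
    [0.222e-3, -0.492e-3, 2.500],
]

# For each G function: the C column ([F^h G/w], h = 1, 2, 3) and [GG/w], from Table 3.
functions = {
    "angle P": ([0.182e-3, 0.182e-3, 0.500], 0.500),
    "P + Q + R": ([0.222e-3, -0.492e-3, 2.500], 2.500),
    "area": ([-0.657e3, 0.262e3, -0.094e6], 3.687e12),
}


def reciprocal_weight(c_column, gg):
    """Variance coefficient of the adjusted function: [GG/w] minus the reduction."""
    B = solve(L, c_column)
    return gg - sum(c * b for c, b in zip(c_column, B))


sigma2 = 4.23e-8
ARC_MINUTE = math.pi / (180 * 60)

recip = {name: reciprocal_weight(c, gg) for name, (c, gg) in functions.items()}
se_P_minutes = math.sqrt(sigma2 * recip["angle P"]) / ARC_MINUTE
se_area = math.sqrt(sigma2 * recip["area"])
print(recip, se_P_minutes, se_area)
```

For the sum P + Q + R the C column is identical with the third column of the normal equations, which is why its variance coefficient vanishes.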

Exercise 1. Prove by Eq. 9, page 40, that after adjustment the 
weight of the area is a little more than double its weight before 
adjustment. 

Hint: By using Eq. 9, page 40, we find that

$$\frac{1}{w_{\text{area}}} = \text{area}^2 \left\{ \frac{1}{p^2 w_p} + \frac{1}{q^2 w_q} + \frac{\cot^2 R}{w_R} \right\} = 1.12 \times 10^{12} \{1.37 + 1.07 + 2.24\} = 5.25 \times 10^{12} \text{ before adjustment}$$

Therefore

$$w_{\text{area}} = 0.19 \times 10^{-12} \text{ (before)}$$

We had

$$w_{\text{area}} = \text{weight of } G^3 = 0.43 \times 10^{-12} \text{ (after)}$$

The result stated follows at once.
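The figures in the hint can be checked by a direct propagation of error for the area $\tfrac{1}{2} pq \sin R$. This is a sketch only; it assumes the adjusted values of p, q, and R quoted in Section 35 and takes the variance coefficient after adjustment (2.302 × 10^12) from Row IV^3.

```python
import math

# Adjusted sides (feet) and adjusted angle R, with the weights of the observations.
p, q = 1723.45, 2205.73
R = math.radians(33 + 49.42 / 60)
w_p, w_q, w_R = 24.6e-8, 19.2e-8, 1.0

area = 0.5 * p * q * math.sin(R)

# Reciprocal weight of the area computed from the unadjusted observations alone:
# (dA/dp)^2/w_p + (dA/dq)^2/w_q + (dA/dR)^2/w_R, with dA/dp = A/p, dA/dq = A/q, dA/dR = A cot R.
recip_before = area ** 2 * (
    1.0 / (p ** 2 * w_p) + 1.0 / (q ** 2 * w_q) + (1.0 / math.tan(R)) ** 2 / w_R
)
recip_after = 2.302e12                     # from Row IV^3 of the solution

w_before = 1.0 / recip_before
w_after = 1.0 / recip_after
print(area, w_before, w_after, w_after / w_before)
```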

Exercise 2. (From L. D. Weld’s Theory of Errors and Least
Squares, Macmillan, 1916.) Take the line AB, on which are
located points C and D. The whole line and its segments are
measured with the same rule under similar conditions, the results
being

X_1 = AC = 45.10 cm., mean of 2 observations
X_2 = AD = 77.96 cm., mean of 3 observations
X_3 = CD = 32.95 cm., mean of 2 observations
X_4 = CB = 98.36 cm., mean of 3 observations
X_5 = DB = 65.55 cm., mean of 2 observations
X_6 = AB = 143.55 cm., mean of 4 observations

A        C        D        B

Fig. 13. The line and its segments, corresponding to Exercise 2.

Problem. Find the least squares values of the lengths.

Take $w_1 = 2$, $w_2 = 3$, $w_3 = 2$, $w_4 = 3$, $w_5 = 2$, $w_6 = 4$.

Conditions:

$$\begin{aligned}
F^1 &= x_1 + x_3 + x_5 - x_6 = 0 \\
F^2 &= x_1 - x_2 + x_3 = 0 \\
F^3 &= x_3 - x_4 + x_5 = 0
\end{aligned}$$

Show that the normal equations are as follows.

Row    λ1    λ2    λ3    = 1          Sum
I      21    12    12    60·10^-2     105
2            16    6     108          142
3                  16    168          202
4                        0            328
Solution:

$$\lambda_1 = -0.1145, \qquad \lambda_2 = +0.0952, \qquad \lambda_3 = +0.1552$$

Residuals (by applying Eq. 12, p. 54):

V_1 = -0.0096 cm.
V_2 = -0.0317 cm.
V_3 = +0.0680 cm.
V_4 = -0.0517 cm.
V_5 = +0.0203 cm.
V_6 = +0.0286 cm.

Adjusted values: 

AC = 45.110 cm. 

AD = 77.992 “ 

CD = 32.882 “ 

CB = 98.412 “ 

DB = 65.530 “ 

AB = 143.522 “ 

(AB actually turns out to be 143.521 cm., but the last decimal is 
raised one unit to satisfy the first condition. The other two con¬ 
ditions are satisfied perfectly by the adjusted segments.) 
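The whole of Exercise 2 can be carried through in a few lines, which may serve as a check on hand computation. The sketch below is illustrative: the matrix `A` holds the derivatives of the three conditions, and `solve` is an ordinary Gaussian elimination standing in for the tabular routine of Section 34.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            factor = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= factor * M[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - tail) / M[k][k]
    return x


X = [45.10, 77.96, 32.95, 98.36, 65.55, 143.55]   # AC, AD, CD, CB, DB, AB (cm.)
w = [2, 3, 2, 3, 2, 4]                             # numbers of observations

# Derivatives of the conditions F1, F2, F3 with respect to x1 ... x6.
A = [
    [1, 0, 1, 0, 1, -1],     # AC + CD + DB - AB = 0
    [1, -1, 1, 0, 0, 0],     # AC - AD + CD = 0
    [0, 0, 1, -1, 1, 0],     # CD - CB + DB = 0
]

F0 = [sum(a * x for a, x in zip(row, X)) for row in A]
L = [
    [sum(A[h][i] * A[k][i] / w[i] for i in range(6)) for k in range(3)]
    for h in range(3)
]
lam = solve(L, F0)

V = [sum(lam[h] * A[h][i] for h in range(3)) / w[i] for i in range(6)]
adjusted = [x - v for x, v in zip(X, V)]
S = sum(wi * vi ** 2 for wi, vi in zip(w, V))
closure = [sum(a * x for a, x in zip(row, adjusted)) for row in A]

print([round(12 * c) for c in L[0]], [round(1200 * f) for f in F0])
print(lam, V, adjusted, S)
```

Multiplying the coefficients by 12 (and the constants by 1200) gives the whole numbers of the normal equations printed in the exercise. Because the conditions are linear, the adjusted lengths satisfy them exactly before rounding.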

Exercise 3. (a) By Row IV in the solution of the normal
equations of the preceding exercise, the minimized value of $\sum wV^2$ is
0.0246.

(b) Find $\sum wV^2$ by direct computations, using the values $V_1$,
$V_2$, etc., found in the solution. Ans. 0.0246.

Exercise 4. Find the standard errors of AB and AD, taking the
standard error $\sigma$ of a single measurement to be 0.05 cm.

Exercise 5. (a) Show that the estimate of $\sigma$ made from $\sum wV^2$
is $\sigma(\text{ext}) = 0.09$.

(b) Show that, with $\sigma = 0.05$, $\chi^2$ = about 10, and $P(\chi^2) = 0.02$,
wherefore we might say that the discordance between the observed
lengths of the segments is somewhat larger than one might expect
from previous experience.

Note: Since the individual measurements were not recorded,
there is no possibility of estimating $\sigma$ from the original observations;
i.e., we have no $\sigma(\text{int})$ to compare with the prior $\sigma$ and $\sigma(\text{ext})$.
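The figures of Exercise 5 can be verified directly. The sketch assumes three degrees of freedom (one for each condition), for which the upper tail of the chi-square distribution has a closed form in terms of the complementary error function; the function name is illustrative.

```python
import math


def chi2_upper_tail_3dof(x: float) -> float:
    """P(chi-square > x) for 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


S = 0.0246                 # minimized sum of weighted squares from Exercise 3
conditions = 3             # degrees of freedom assumed equal to the number of conditions
sigma_prior = 0.05         # prior standard error of a single measurement, cm.

sigma_ext = math.sqrt(S / conditions)
chi2 = S / sigma_prior ** 2
P = chi2_upper_tail_3dof(chi2)
print(sigma_ext, chi2, P)
```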

Exercise 6. The three inside edges of a rectangular parallelepiped
are measured with calipers and a linear scale; and the
volume is measured in cubic units by filling it with mercury, which
is afterward poured into a graduated cylinder. The results of a
set of observations are as follows:

                                           Mean         n      Standard deviation
On edges parallel to the x direction,      X_1 (cm.)    n_1    s_1 (cm.)
On edges parallel to the y direction,      X_2 (cm.)    n_2    s_2 (cm.)
On edges parallel to the z direction,      X_3 (cm.)    n_3    s_3 (cm.)
On the volume,                             X_4 (cc.)    n_4    s_4 (cc.)

If randomness has been demonstrated, one may pool the
standard deviations of the measurements on the three sides to
get an estimate of the standard error of a single observation on a
linear measurement. If $\sigma_1$ denotes the standard error of a single
linear measurement, then one would write

$$\sigma_1^2 (\text{est'd}) = \frac{n_1 s_1^2 + n_2 s_2^2 + n_3 s_3^2}{n_1 + n_2 + n_3 - 3}$$

(Cf. Eq. 67 in Deming and Birge, cited on p. 29.)

If $n_1 + n_2 + n_3$ is fairly large (20 or 30), this estimate will be
reliable enough. For $\sigma_4$, the standard error of a direct
determination of volume, one would likewise write

$$\sigma_4^2 (\text{est'd}) = \frac{n_4 s_4^2}{n_4 - 1}$$

If $n_4$ is as large as 20 or 30, this estimate will also be reliable
enough. After obtaining estimates of $\sigma_1$ and $\sigma_4$ one would
assign weights to the observations $X_1$, $X_2$, $X_3$, and $X_4$ as follows
(see Eqs. 18, p. 26):

$$w_1 = \frac{n_1 \sigma^2}{\sigma_1^2}, \qquad w_2 = \frac{n_2 \sigma^2}{\sigma_1^2}, \qquad w_3 = \frac{n_3 \sigma^2}{\sigma_1^2}, \qquad w_4 = \frac{n_4 \sigma^2}{\sigma_4^2}$$

$\sigma^2$, as in Section 11, is an arbitrary factor of proportionality, the
square of the standard error of observations of unit weight. If it is set equal
to $\sigma_1^2$, we should have the convenient system of weights,

$$w_1, w_2, w_3, w_4 = n_1, \; n_2, \; n_3, \; \frac{n_4 \sigma_1^2}{\sigma_4^2}$$

The weights having been settled on, we can proceed. The
one and only condition on the adjusted values is that

$$x_4 = x_1 x_2 x_3$$

whence we put

$$F = x_4 - x_1 x_2 x_3$$

Suppose we need the standard error of the volume after
adjustment; we set

$$G = x_4$$

(a) Show that the one and only normal equation is $L\lambda = F_0$,
whence $\lambda = F_0/L$, where

$$L = X_4^2 \left\{ \frac{1}{X_1^2 w_1} + \frac{1}{X_2^2 w_2} + \frac{1}{X_3^2 w_3} + \frac{1}{X_4^2 w_4} \right\}$$



(b) In tabular form, the normal equation for finding $\lambda$, S, and the
weight of the adjusted volume, is as follows:

Row     λ     = 1            C
I       L     F_0            1/w_4
2             0              1/w_4
3             -F_0 F_0/L
II            -F_0 F_0/L
4                            -1/(L w_4^2)
II^1                         (1/w_4)(1 - 1/(L w_4))





(c) The reciprocal of the weight of the adjusted volume is $(1/w_4)(1 - 1/w_4 L)$.

(d) (The standard error of the adjusted volume)$^2$ = $\sigma^2 (1/w_4)(1 - 1/w_4 L)$.

This standard error is smaller than the standard error of the volume
before adjustment by the fractional amount $1/Lw_4^2$.

(e) The minimized sum of the weighted squares of the residuals,
S, is $F_0 \lambda$.

(f) The estimate of $\sigma^2$ by external consistency (Sec. 13) is

$$\sigma^2(\text{ext}) = \frac{F_0 \lambda}{4 - 1}$$

(g) What would you say if $\sigma^2(\text{ext})$ were much larger than your
assumed value of $\sigma^2$, i.e., $P(\chi^2)$ small?

Suggestions: Edges not parallel; lack of perpendicularity; measurements
not so good as initially supposed (i.e., $\sigma_1$ or $\sigma_4$ too small);
just happened to be so.

(h) Show that after adjustment the standard error of the first
edge is

$$\sigma \sqrt{\frac{1}{w_1}\left(1 - \frac{X_4^2}{w_1 X_1^2 L}\right)}$$
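A numerical illustration may help in following parts (a) to (e). All the measurements and weights below are hypothetical, invented only to exercise the formulas; the derivatives are taken as exact products of the observed edges rather than through the approximation $X_4/X_1$ and the like.

```python
# Hypothetical observed means: three edges (cm.) and the volume (cc.).
X = [10.02, 5.01, 2.00, 100.80]
# Hypothetical weights: n = 10 readings each, sigma_1 = 0.01 cm., sigma_4 = 0.5 cc.,
# with the arbitrary factor sigma^2 set equal to sigma_1^2.
w = [10.0, 10.0, 10.0, 10.0 * 0.01 ** 2 / 0.5 ** 2]

# Single condition F = x4 - x1*x2*x3 and its derivatives at the observed values.
F0 = X[3] - X[0] * X[1] * X[2]
F = [-X[1] * X[2], -X[0] * X[2], -X[0] * X[1], 1.0]

L = sum(f * f / wi for f, wi in zip(F, w))      # the one normal-equation coefficient
lam = F0 / L                                    # the one Lagrange multiplier

V = [lam * f / wi for f, wi in zip(F, w)]       # residuals
x = [Xi - Vi for Xi, Vi in zip(X, V)]           # adjusted values
S = F0 * lam                                    # minimized sum of weighted squares
S_direct = sum(wi * vi ** 2 for wi, vi in zip(w, V))

recip_before = 1.0 / w[3]
recip_after = (1.0 / w[3]) * (1.0 - 1.0 / (L * w[3]))   # variance coefficient of adjusted volume
closure = x[3] - x[0] * x[1] * x[2]             # small, not zero: the condition is not linear

print(lam, V, x, S, S_direct, recip_before, recip_after, closure)
```

Since the condition is a product and not a linear form, the adjusted values satisfy it only to the first order; a second round with derivatives evaluated at the adjusted values would reduce the remaining discrepancy.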


36. Shorter method of computing the weights of a large number
of functions.² The theory on which the weights of the three G
functions were calculated in Sections 34 and 35 rests on the fact
that³

$$\frac{1}{(\text{wt. of } G)} = \left[\frac{GG}{w}\right] - \left[\frac{F^1 G}{w}\right] B' - \left[\frac{F^2 G}{w}\right] B'' - \left[\frac{F^3 G}{w}\right] B''' \tag{16}$$

² To be omitted on first reading; the suggestion is that the reader return to
this after a study extending through Section 61.

³ Gauss, Theoria Combinationis (cited in Sec. 13), Art. 29.

where $B'$, $B''$, and $B'''$ satisfy the equations

$$\begin{aligned}
L_{11} B' + L_{12} B'' + L_{13} B''' &= \left[\frac{F^1 G}{w}\right] \\
L_{21} B' + L_{22} B'' + L_{23} B''' &= \left[\frac{F^2 G}{w}\right] \\
L_{31} B' + L_{32} B'' + L_{33} B''' &= \left[\frac{F^3 G}{w}\right]
\end{aligned} \tag{17}$$

In other words, the auxiliary constants $B'$, $B''$, and $B'''$ will
satisfy the normal equations (Eqs. 2, p. 73) if the “C” column
replaces the “1” column.

One may, if he chooses, solve for the Lagrange multipliers, and
any set of auxiliary constants $B'$, $B''$, $B'''$ as well, by first of all
calculating the reciprocal matrix

$$A^{-1} = \begin{pmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{pmatrix} \tag{18}$$

(See Exs. 2, 4, and 5 of Sec. 61.)

and then using it to calculate the Lagrange multipliers and the
auxiliary multipliers in the manner following:

$$\begin{aligned}
\lambda_1 &= F_0^1 c_{11} + F_0^2 c_{12} + F_0^3 c_{13} \\
\lambda_2 &= F_0^1 c_{21} + F_0^2 c_{22} + F_0^3 c_{23} \\
\lambda_3 &= F_0^1 c_{31} + F_0^2 c_{32} + F_0^3 c_{33}
\end{aligned} \tag{19}$$



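$$\begin{aligned}
B' &= \left[\frac{F^1 G}{w}\right] c_{11} + \left[\frac{F^2 G}{w}\right] c_{12} + \left[\frac{F^3 G}{w}\right] c_{13} \\
B'' &= \left[\frac{F^1 G}{w}\right] c_{21} + \left[\frac{F^2 G}{w}\right] c_{22} + \left[\frac{F^3 G}{w}\right] c_{23} \\
B''' &= \left[\frac{F^1 G}{w}\right] c_{31} + \left[\frac{F^2 G}{w}\right] c_{32} + \left[\frac{F^3 G}{w}\right] c_{33}
\end{aligned} \tag{20}$$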


The Lagrange multipliers ($\lambda$), after being calculated from
Eqs. 19, are used in Eqs. 12, page 54, to compute the residuals $V_1$,
$\ldots$, $V_n$, just as was done on page 81. The auxiliary constants $B'$,
$B''$, and $B'''$ from Eqs. 20 are used in Eq. 16 to find the weight of
the function G. It will be noticed that the coefficients multiplying 
the c coefficients in Eqs. 19 will already be available from the first 
step, outlined on page 70 and carried out numerically on page 77. 
The brackets in Eqs. 20 arise by cumulating squares and cross- 
products from Table 2 of the fourth step (pp. 72 and 79; summed 
numerically in Table 3 on p. 80). It is not difficult to extend 
Tables 2 and 3 to take account of a new G function any time it is 
desired to introduce one. 

The work then proceeds rapidly, the reciprocal matrix $A^{-1}$
being used over and over in Eqs. 20 for all the G functions. If one
is working with a fairly good-sized number of G functions, this 
scheme will save considerable time over the direct computation 
illustrated in Section 34. 

A distinct advantage of using the auxiliary multipliers is that 
the reciprocal matrix, once computed, is ready for use any time a 
new C column is produced, whereas, with the direct solution in 
Section 34 it is no little trouble to introduce a new C column after a 
solution has once been carried through. 

The three G functions used in Section 34 will serve for an illustration.
To calculate the reciprocal matrix $A^{-1}$ we take the coefficients
of the unknowns in the normal equations on pages 82 and 83,
and put the unit matrix on the right of the equality sign, thus
starting off with the equations

$$
\begin{aligned}
0.565 \cdot 10^{-6}\, x + 0.345 \cdot 10^{-6}\, y + 0.222 \cdot 10^{-3}\, z &= 1,\ 0,\ 0 \\
0.345 \cdot 10^{-6}\, x + 1.190 \cdot 10^{-6}\, y - 0.492 \cdot 10^{-3}\, z &= 0,\ 1,\ 0 \\
0.222 \cdot 10^{-3}\, x - 0.492 \cdot 10^{-3}\, y + 2.500\, z &= 0,\ 0,\ 1
\end{aligned} \tag{21}
$$



The letters x, y, z designate the three unknowns that are to be
solved for. Since there are three constant columns on the right,
there will be three different solutions. The simplest way to
obtain them would be to follow the regular routine for solving
normal equations, as illustrated in Sections 34 and 61. However,
one unfamiliar with that procedure may make three separate
solutions. First, one would use the constant column (1, 0, 0).
By any method of solution whatever he would obtain

$$
x = 2.458 \times 10^{6}, \qquad y = -0.875 \times 10^{6}, \qquad z = -0.391 \times 10^{3}
$$

Second, he would use the constant column (0, 1, 0) and find

$$
x = -0.874 \times 10^{6}, \qquad y = 1.226 \times 10^{6}, \qquad z = 0.319 \times 10^{3}
$$

Third, he would use the constant column (0, 0, 1) and find

$$
x = -0.390 \times 10^{3}, \qquad y = 0.319 \times 10^{3}, \qquad z = 0.498
$$


The reciprocal matrix is simply a convenient way of filing these
results systematically. It is written like this:

$$
A^{-1} = \begin{bmatrix}
2.458 \cdot 10^{6} & -0.874 \cdot 10^{6} & -0.390 \cdot 10^{3} \\
-0.875 \cdot 10^{6} & 1.226 \cdot 10^{6} & 0.319 \cdot 10^{3} \\
-0.391 \cdot 10^{3} & 0.319 \cdot 10^{3} & 0.498
\end{bmatrix} \tag{22}
$$
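The three separate solutions can be reproduced mechanically. The following is a minimal sketch in plain Python, not the routine of Sections 34 and 61: it assumes that the coefficients of Eqs. 21 carry the factors 10^-6 and 10^-3 (a reading chosen because it reproduces the reciprocal matrix of Eq. 22), and the helper name `solve_columns` is an invention for this illustration.

```python
def solve_columns(a, rhs_columns):
    """Gauss-Jordan elimination with partial pivoting.

    Returns one solution vector per constant column in rhs_columns.
    """
    n = len(a)
    # Augment the coefficient matrix with every constant column at once.
    aug = [list(a[i]) + [col[i] for col in rhs_columns] for i in range(n)]
    for k in range(n):
        # Bring the largest remaining entry of column k onto the diagonal.
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]
        for r in range(n):
            if r != k:
                f = aug[r][k]
                aug[r] = [vr - f * vk for vr, vk in zip(aug[r], aug[k])]
    return [[aug[i][n + c] for i in range(n)] for c in range(len(rhs_columns))]


# Coefficients of the normal equations (Eqs. 21); powers of ten are assumed.
A = [
    [0.565e-6, 0.345e-6, 0.222e-3],
    [0.345e-6, 1.190e-6, -0.492e-3],
    [0.222e-3, -0.492e-3, 2.500],
]
unit_columns = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

solutions = solve_columns(A, unit_columns)
# Filing the three solutions side by side as columns gives the reciprocal matrix.
inverse = [[solutions[c][r] for c in range(3)] for r in range(3)]

for row in inverse:
    print("  ".join(f"{v: .4e}" for v in row))
```

Carrying full machine precision removes the small failures of symmetry; the entries agree with Eq. 22 to about three significant figures.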


The occasional failure of symmetry in the third decimal place
comes from not carrying more figures; but what we have is good
enough. Supposing that the Lagrange multipliers have not been
worked out, we should next compute them from Eqs. 19 as follows:

$$
\begin{aligned}
\lambda_1 &= -0.133 \cdot 2.458 + 0.054 \cdot 0.875 - 0.073 \cdot 0.391 = -0.309 \\
\lambda_2 &= +0.133 \cdot 0.874 - 0.054 \cdot 1.226 + 0.073 \cdot 0.319 = 0.073 \\
10^{3} \lambda_3 &= +0.133 \cdot 0.390 - 0.054 \cdot 0.319 + 0.073 \cdot 0.498 = 0.071
\end{aligned} \tag{23}
$$




These agree well enough with the values —0.308, 0.073, and 0.071 
already found in Section 35 (conclusion 2°, p. 81). 

The chief aim at present is to compute the auxiliary constants
B', B'', B''' for each of the three G functions of Section 34. Going
back to Table 3 in Section 34 for the coefficients needed for Eqs. 20,
we find that

For $G^1$

$$
\begin{aligned}
B' &= 10^{3} \{+0.182 \cdot 2.458 - 0.182 \cdot 0.875 - 0.500 \cdot 0.391\} = 0.0933 \cdot 10^{3} \\
B'' &= 10^{3} \{-0.182 \cdot 0.874 + 0.182 \cdot 1.226 + 0.500 \cdot 0.319\} = 0.224 \cdot 10^{3} \\
B''' &= -0.182 \cdot 0.390 + 0.182 \cdot 0.319 + 0.500 \cdot 0.498 = 0.236
\end{aligned} \tag{24}
$$

These values used in Eq. 16 give

$$
\frac{1}{\text{wt.}} \text{ of } G^1 = 0.500 - 0.182 \cdot 0.0933 - 0.182 \cdot 0.224 - 0.500 \cdot 0.236 = 0.324 \tag{25}
$$

That is, the weight of $G^1$ = 1/0.324, in agreement with conclusion
4° in Section 35, page 85.

For $G^2$

$$
\begin{aligned}
B' &= 10^{3} \{+0.222 \cdot 2.458 + 0.492 \cdot 0.875 - 2.500 \cdot 0.391\} = 0.00133 \cdot 10^{3} \\
B'' &= 10^{3} \{-0.222 \cdot 0.874 - 0.492 \cdot 1.226 + 2.500 \cdot 0.319\} = 0.00028 \cdot 10^{3} \\
B''' &= -0.222 \cdot 0.390 - 0.492 \cdot 0.319 + 2.500 \cdot 0.498 = 1.0015
\end{aligned} \tag{26}
$$

These used in Eq. 16 give

$$
\frac{1}{\text{wt.}} \text{ of } G^2 = 2.500 - 0.222 \cdot 0.00133 - 0.492 \cdot 0.00028 - 2.500 \cdot 1.0015 = -0.004 \tag{27}
$$

Since weights can not be negative, we may suppose that this
negative result arises from not carrying enough figures. The lowest
possible result, if all figures had been carried, would be 0.
Since we know what the result ought to be, we shall call it 0,
whereupon the weight of $G^2$ is infinity, as is already known
(conclusion 4°, Sec. 35, p. 85).

For $G^3$

As an exercise, the student should calculate B', B'', and B''' for
$G^3$ in like manner, obtaining

$$
\begin{aligned}
B' &= -1.808 \cdot 10^{9} \\
B'' &= 0.865 \cdot 10^{9} \\
B''' &= 0.293 \cdot 10^{6}
\end{aligned} \tag{28}
$$

whereupon

$$
\frac{1}{\text{wt.}} \text{ of } G^3 = 10^{12} \{3.687 - 0.657 \cdot 1.808 - 0.262 \cdot 0.865 + 0.094 \cdot 0.293\} = 2.300 \cdot 10^{12} \tag{29}
$$

in agreement with conclusion 4° in Section 35 (p. 85).
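The reuse of the reciprocal matrix for several G functions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tabular routine of Section 34: the matrix is the one printed in Eq. 22, the bracket sums [F G/w] and [G G/w] and their powers of ten are read off from the numerical work in Eqs. 24 to 29, and the function name `reciprocal_weight` is hypothetical.

```python
# Reciprocal matrix as printed in Eq. 22.
C = [
    [2.458e6, -0.874e6, -0.390e3],
    [-0.875e6, 1.226e6, 0.319e3],
    [-0.391e3, 0.319e3, 0.498],
]


def reciprocal_weight(fg, gg, c=C):
    """Return (1/weight of G, [B', B'', B''']).

    fg holds the three bracket sums [F^h G / w]; gg is [G G / w].
    The B's follow Eqs. 20; the reciprocal weight follows the pattern
    of Eqs. 25, 27 and 29: [GG/w] minus the sum of [F^h G/w] times B.
    """
    b = [sum(c[h][k] * fg[k] for k in range(3)) for h in range(3)]
    return gg - sum(fg[h] * b[h] for h in range(3)), b


# Bracket sums inferred from the printed arithmetic (assumed scales).
recip_g1, b_g1 = reciprocal_weight([0.182e-3, 0.182e-3, 0.500], 0.500)
recip_g2, b_g2 = reciprocal_weight([0.222e-3, -0.492e-3, 2.500], 2.500)
recip_g3, b_g3 = reciprocal_weight([-0.657e3, 0.262e3, -0.094e6], 3.687e12)

print(recip_g1)   # close to 0.324
print(recip_g2)   # close to zero (slightly negative from rounding)
print(recip_g3)   # close to 2.300e12
```

Only the list of bracket sums changes from one G function to the next; the matrix `C` is computed once.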

Remark. The number of auxiliary constants B', B'', B''',
etc., in Eq. 16 is equal to the number $\nu$ of conditions, i.e., the
number of F functions. This is also the number of Lagrange
multipliers ($\lambda$), the number of equations in Eqs. 21, and the
order of the reciprocal matrix. In contrast, the number of G
functions whose weights are wanted may be any whatever,
smaller or larger than the number of F functions.



CHAPTER VII 


ADJUSTING SAMPLE FREQUENCIES TO 
EXPECTED MARGINAL TOTALS 

37. Statement of the problem. In social and economic surveys 
that are carried out by sampling, it is sometimes desirable to 
adjust the sample frequencies, or to adjust certain sample ratios, to 
make them agree with certain corresponding totals or ratios that 
are known from other sources. This happens, e.g., in the work of 
the Census: there is a complete count of certain characteristics 
for the individuals in the population, but in consideration of 
efficiency in time and costs, data on some characteristics are 
collected on a sample basis in the first place, and the tabulations of 
these sample data need to be adjusted to the complete count. 
Moreover, many of the cross-tabulations or joint distributions of 
population characteristics that have been obtained on a complete 
count are limited to a sample when the data are processed in the 
Washington office, and these cross-tabulations likewise need to be 
adjusted. The sample, except in extremely fine classifications, is 
entirely adequate for purposes of action (the only purpose of taking 
any survey in the first place). The data of the sample are usually 
published as estimates of what would have been obtained by 
tabulating the characteristics for the entire population instead of 
only a sample thereof. This means that the sample is to be 
adjusted to certain totals that are known from other sources (as a 
complete count). 

The situation may be as shown in Fig. 14 in parallel tables for
the universe and for the sample. For the universe, the marginal
totals $N_{i\cdot}$ and $N_{\cdot j}$ are known from the complete count, but not the
individual cell frequencies $N_{ij}$; for the sample, however, tabulation
gives both the sample marginal totals $n_{i\cdot}$ and $n_{\cdot j}$, and the
sample cell frequencies $n_{ij}$. After adjustment, the marginal totals of
the (adjusted) sample and complete count will agree. The problem
is to write in the cell frequencies of the universe, with the aid
of the sample, preserving the marginal totals that are fixed by
knowledge of the universe. It is the object of this chapter to show
schemes for performing such adjustments.

[Fig. 14: two parallel tables with columns j = 1, 2, ..., s.
UNIVERSE: cell frequencies $N_{ij}$ unknown; marginal totals $N_{i\cdot}$ and
$N_{\cdot j}$ known; N known. SAMPLE: cell frequencies $n_{ij}$ known; marginal
totals $n_{i\cdot}$ and $n_{\cdot j}$ known; n known.]

Fig. 14. Showing the system of notation for the cell frequencies and marginal
totals of the universe and the sample in the two-dimensional problem.


38. Cell frequencies and sampling errors. A statistical table 
shows the frequencies of occurrence of the various members of 
subclasses within a population or universe, and is made up of cells, 
one for each subclass. A two-dimensional universe is formed by 
the crossing of two classifications, as depicted schematically in 
Fig. 14. An example is contained in Table 1, page 107. The 
title of the table ordinarily describes the universe. The box head¬ 
ings over the columns define various mutually exclusive classes 
according to one system of classification, and the stub does likewise 
for some other system of classification. A member of the universe 
will belong to one of the classes that are defined in the heading,
and at the same time it will belong to one of the classes that are
defined in the stub; it is said to be a member of the subclass that is 
defined by the combination or cross-classification of two particular 
classes described respectively in heading and stub. That is to 
say, a member of the universe must lie in one column or another, 
and at the same time it must lie in one row or another; in the table 
it lies in the space common to a particular column and a particular 
row. This space is called a cell. The number written in the cell 
is a cell frequency and it shows how many members of this particular 
subclass were recorded in the enumeration of the universe, by 
sample or complete count. For instance, in Table 1, in the cell 
designated by the combined ages 14 and 15 (shown in the heading), 
for the state of New Hampshire (shown in the stub), is recorded 
a sample frequency of 395. When the tabulation is prepared by 
crossing three classifications, the result is a three-dimensional table. 
A three-dimensional table is usually printed as a set of two- 
dimensional tables, rather than as a single table. These single 
two-dimensional tables all show the same heading and stub, and 
each one represents the members of one class of the third classifi¬ 
cation, as the heading will show. A three-dimensional universe is 
depicted schematically in Fig. 15. Similarly, one may have four-, 
five-, or n-dimensional tables. The sum obtained by adding the
frequencies of an entire row or column is a marginal total or rim
total, although it could well be called a class total.

When the data for the table arise from sampling, the frequencies 
(numbers) obtained are smaller than if the coverage had been com¬ 
plete. For instance, if the sample is a so-called 5 percent sample, 
the numbers in the table will be only about 5 percent of what 
they would have been had a complete count been taken. It is 
not possible to perform the sampling in such a way that the sample 
frequencies are exactly 5 percent of what would be obtained on 
a complete count. If the sampling were so carried out, it would 
be sufficient merely to multiply every cell frequency by 20, and 
every marginal total also by 20. (The number 20 is spoken 
of as the sampling ratio, the reciprocal of 5 percent.) But because
of sampling errors, and possibly also because of certain biases
that inevitably enter any survey, the sample frequencies will
not be just 1/20th of the frequencies that would be shown by a
complete count. For the convenience of the user of a table, these
sample frequencies are sometimes adjusted to some or all of the
marginal totals that happen to be known from other sources, as
by a complete count. This is a convenience to the user, because
after adjustment the identical marginal totals are found in tables
having the same marginal specifications. Without adjustment,
the frequencies might be alike enough for purposes of action, but
perhaps not close enough for identification. The known marginal
totals that are used in an adjustment are spoken of as controls,
or control totals.

Fig. 15. Showing the system of notation for the cell frequencies and marginal
totals in the three-dimensional sample. The cell shown shaded is designated
by the indices ijk. The sample frequency falling in this cell is $n_{ijk}$. The
corresponding adjusted deflated frequency is $m_{ijk}$ and the adjusted inflated
frequency is $M_{ijk}$. Some of the tube and slice totals are indicated.

The adjustment is more than a convenience to the user; it 
diminishes the sampling variance to some extent; the more con¬ 
trols the smaller the sampling variance of the adjusted frequencies. 
(See the exercise at the end of Ch. V.) As a practical matter, 
however, this diminishing of the sampling variance should not be 
overemphasized, because biases and other difficulties may have a 
much greater effect than the sampling errors. 

In the work of the Census not all sample tables are adjusted to
all the known marginal totals. Even with the short cuts that will
be described here, and which are more fully described elsewhere,$^{1,2}$
it may be more important to publish the table at once, after merely
multiplying the sample results by the sampling ratio, rather than
to wait for adjustments to be made. One of the main advantages
of sampling is quick processing, and this is particularly important
for government planning in times of economic and social stress, in
which the delay of only the brief time required for adjustment may
not be advisable.

39. Nature of the adjustment. It will perhaps be realized by 
now that the problems to be taken up in this chapter are similar 
to the geometric ones in the last two chapters — similar in that 
the conditions imposed on the adjusted values are rigorous, not 
involving adjustable parameters. The same procedure for enforc¬ 
ing the least squares criterion will be found to give us an answer in 
this problem, as it did in the geometric problems. Here, however, 
short cuts will be described, which will greatly diminish the amount 
of computational labor and expense. 

1 W. Edwards Deming and Frederick F. Stephan, “On a least squares
adjustment of a sample frequency table when the expected marginal totals
are known,” Annals of Mathematical Statistics, vol. XI, No. 4, December 1940:
pp. 427-444.

2 Frederick F. Stephan, “An iterative method of adjusting sample frequency
tables when expected marginal totals are known,” Annals of Mathematical
Statistics, vol. XIII, No. 2, June 1942: pp. 166-178.





40. A closer look at the problem. In estimating any cell frequency
of the universe, such as $N_{ij}$, three possibilities present themselves:
from the sample one may make an estimate from the
sampling ratio of the ith row alone, another from the sampling
ratio of the jth column alone, and still another from the over-all
sampling ratio N/n. Specifically, the three estimates would be
$n_{ij} N_{i\cdot}/n_{i\cdot}$, $n_{ij} N_{\cdot j}/n_{\cdot j}$, and $n_{ij} N/n$. These are simple
multiplications of the observed cell frequency by three sampling ratios,
viz., the sampling ratio $N_{i\cdot}/n_{i\cdot}$ in the ith row, $N_{\cdot j}/n_{\cdot j}$ in the jth
column, and the over-all sampling ratio, N/n. Because of sampling
errors, these three adjustments will not be identical except by
accident, and though any of them by itself may be considered
accurate enough, still, if the whole r × s table of universe cell
frequencies were estimated by any one of these three adjustments, the
marginal totals would not come out equal to the known values.
This chapter presents three rapid methods of adjustment, which in
effect combine all three of the estimates just mentioned, and at the
same time enforce agreement with the marginal totals. These
methods can be extended to varying degrees of cross-tabulation in
three dimensions.
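The disagreement among the three single-ratio estimates is easy to see numerically. The sketch below is illustrative only: the sample figures are those of one cell of Table 1 in Section 43 (Vermont, ages 14 and 15), while the universe totals are hypothetical, taken here as 20 times the deflated totals of that table; the function name `cell_estimates` is likewise invented for this example.

```python
def cell_estimates(n_ij, n_row, n_col, n, N_row, N_col, N):
    """Three estimates of a universe cell frequency N_ij.

    Each multiplies the observed cell frequency by one sampling ratio:
    that of the row, that of the column, and the over-all ratio N/n.
    """
    by_row = n_ij * N_row / n_row
    by_column = n_ij * N_col / n_col
    overall = n_ij * N / n
    return by_row, by_column, overall


# Sample figures from Table 1 (Vermont, ages 14 and 15).
n_ij, n_row, n_col, n = 419, 2352, 5260, 33837
# Hypothetical universe totals: 20 times the deflated totals 2432, 5286, 33837.
N_row, N_col, N = 20 * 2432, 20 * 5286, 20 * 33837

by_row, by_column, overall = cell_estimates(n_ij, n_row, n_col, n, N_row, N_col, N)
print(round(by_row, 1), round(by_column, 1), round(overall, 1))
```

The three figures differ by a few percent, which is why no single ratio applied to the whole table can reproduce both sets of marginal totals.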

Any method of adjustment must provide as its end product a set 
of adjusted frequencies that will satisfy the controls provided by 
the known marginal totals. In any problem of adjustment where 
the controls are intricate (many conditions), and where the adjust¬ 
ments are carried out by the hundreds and thousands, as they are 
in the Census, it is necessary to have a method that is straight¬ 
forward and self-checking; this is particularly important in three- 
way tabulations, where in one possible situation (Case VII in 
reference 1) the adjustment in one cell must be balanced by 
adjustments in at least seven others. It turns out, fortunately, 
that methods of the kind required in mass production can be 
devised (Secs. 45, 46, 48, and 49). 

41. The least squares requirement. By the method of least
squares one would enforce the controls (conditions), and at the
same time minimize the sum

$$
S = \sum \frac{1}{n_i} (m_i - n_i)^2 \tag{1}
$$

$n_i$ stands for the observed frequency in the ith cell, and $m_i$ the
adjusted sample frequency therein. $n_i$ is found in the sample survey,
and $m_i$ arises in the adjustment. Here the denominator $n_i$ is
taken as the reciprocal of the weight of the ith cell. The bigger
the frequency, the bigger the average sampling error (absolute
error, not proportionate error), and accordingly the smaller the
weight. It might be argued that the weight should be taken
inversely proportional to $m_i$ rather than $n_i$, but, if the sampling is
accurate enough for the purpose intended, it will make little
difference which is used. Strictly, in random sampling, the
reciprocal of the weight of $n_{ij}$ is $n p_{ij} q_{ij}$, which is nearly equal to
$n_{ij} q_{ij}$, where p and q have their usual connotations. But since
factors proportional to the weights may be substituted for them,
it is sufficient to use $n_{ij}$ as the reciprocal of the weight in cell ij,
since the values of $q_{ij}$ do not usually vary much over the table.
In stratified sampling, the weights are still closely inversely
proportional to $n_{ij}$.
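The step from $n p_{ij} q_{ij}$ to $n_{ij}$ can be written out in one line. This is a sketch under the usual binomial assumptions (simple random sampling, $p_{ij}$ the proportion of the universe falling in cell ij, $q_{ij} = 1 - p_{ij}$, finite-population correction ignored), with $p_{ij}$ estimated by $n_{ij}/n$:

$$
\operatorname{Var}(n_{ij}) = n\, p_{ij}\, q_{ij} \approx n \cdot \frac{n_{ij}}{n} \cdot q_{ij} = n_{ij}\, q_{ij},
\qquad
\text{weight}_{ij} \propto \frac{1}{n_{ij}\, q_{ij}} \approx \frac{\text{const}}{n_{ij}} \ \text{ when } q_{ij} \text{ varies little.}
$$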

42. The two-dimensional problem. Suppose that the data on
two characteristics (e.g., age and highest grade of school completed)
are obtained for each member of a universe of N individuals,
but that tabulations of the complete data provide either

Case I. Only one set of marginal totals, $N_{1\cdot}, N_{2\cdot}, \ldots, N_{r\cdot}$;
or

Case II. Both sets of marginal totals, viz., $N_{1\cdot}, N_{2\cdot}, \ldots, N_{r\cdot}$,
and $N_{\cdot 1}, N_{\cdot 2}, \ldots, N_{\cdot s}$. (See Fig. 14.)

The nature of the tabulations is presumed such that it is not
feasible (too expensive) to count the numbers $N_{ij}$ in the cells, as
would be done if one character were crossed with the other in
tabulation. Suppose, however, that in a sample of n individuals
selected in a random manner from the universe, the two characters
are crossed with each other, so that not only all the s + r marginal
totals $n_{\cdot 1}, \ldots, n_{r\cdot}$ of the sample are known, but also every one of
the numbers $n_{ij}$ (i = 1, 2, ..., r; j = 1, 2, ..., s). The problem
is to estimate the unknown frequencies $N_{ij}$ in the cells of the universe.
This will be done by first finding the calculated or adjusted
sample frequencies $m_{ij}$ and then inflating them by the inverse
sampling ratio N/n.

For the least squares solution we seek those values of $m_{ij}$ that
minimize$^3$

$$
S = \sum \frac{1}{n_{ij}} (m_{ij} - n_{ij})^2 \tag{2}
$$

wherein the $m_{ij}$ are subjected to conditions of Case I or Case II.

Case I: One set of marginal totals known. Assume $N_{1\cdot}, N_{2\cdot}, \ldots,
N_{r\cdot}$ to be known. Then we require the marginal adjustments

$$
\sum_j m_{ij} = m_{i\cdot} \qquad i = 1, 2, \ldots, r \tag{3}
$$

These r equations constitute r conditions on the adjusted $m_{ij}$,
corresponding to Eqs. 3 of Chapter IV, page 50. Assuming that
the adjusted values of the $m_{ij}$ have been found, let each take on a
small variation $\delta m_{ij}$; then the differentials of Eqs. 2 and 3 show that

$$
\tfrac{1}{2}\, dS = \sum \frac{m_{ij} - n_{ij}}{n_{ij}}\, \delta m_{ij} = 0 \qquad \text{(one equation)} \tag{4}
$$

$$
\sum_j \delta m_{ij} = 0 \qquad i = 1, 2, \ldots, r \qquad \text{(r equations)} \tag{5}
$$

Multiply now Eq. $5_i$ by the arbitrary Lagrange multiplier $-\lambda_i$,
and add Eqs. 4 and 5 to obtain

$$
\sum \left\{ \frac{m_{ij} - n_{ij}}{n_{ij}} - \lambda_i \right\} \delta m_{ij} = 0 \qquad \text{(one equation)} \tag{6}
$$

By the same argument that was advanced in Section 27, page 54,
one may now set each brace equal to zero. The r Lagrange multipliers
are then no longer arbitrary, but each must satisfy the
resulting relation

$$
m_{ij} = n_{ij}(1 + \lambda_i) \tag{7}
$$

3 The sign $\sum$ will denote summation over all possible cells, unless otherwise
noted. $\sum_i$ will denote summation over all values of i, and similarly for an
inferior j or k. The dot in $n_{\cdot j}$ will signify the result of summing the $n_{ij}$ over
all values of i in the jth column.





The adjusted frequencies can be computed at once as soon as
the $\lambda_i$ are found. To evaluate them one may rewrite the conditions
(3) using the right-hand member of Eq. 7 for $m_{ij}$, obtaining

$$
m_{i\cdot} = n_{i\cdot}(1 + \lambda_i) \tag{8}
$$

Another way to arrive at this same relation is to sum each member
of Eq. 7 in the ith row. However obtained, $\lambda_i$ is now known, since
$m_{i\cdot}$ and $n_{i\cdot}$ are known, and in fact Eq. 7 now reduces to

$$
m_{ij} = n_{ij}\, \frac{m_{i\cdot}}{n_{i\cdot}} \tag{9}
$$

The adjustment is thus a simple proportionate one by rows, the
cells in any one row all being raised or lowered by the proportionate
adjustment in the row total. Case I thus amounts to r independent
one-dimensional proportionate adjustments, one for each
row; and any one or all may be carried out, as desired. This
result can be obtained by a simpler approach but is presented in
this way for consistency with later cases.

The minimized sum of squares may be computed directly, or
from the row totals by seeing that

$$
S = \sum_i \frac{1}{n_{i\cdot}} (m_{i\cdot} - n_{i\cdot})^2 \tag{10}
$$

The term $(m_{i\cdot} - n_{i\cdot})^2 / n_{i\cdot}$ for the ith row may be considered separately,
and used as $\chi^2$ with s − 1 degrees of freedom, or all rows
may be combined into the minimized S as given in Eq. 10, and used
as $\chi^2$ with r(s − 1) degrees of freedom.
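Case I is simple enough to verify directly. The following sketch applies the proportionate adjustment of Eq. 9 row by row and checks that the minimized sum of squares computed cell by cell (Eq. 2) equals the row-total form (Eq. 10). The frequencies and deflated row totals are borrowed from Table 1 of Section 43 purely as sample input; the function names are inventions for this illustration.

```python
def adjust_rows(n, m_row):
    """Proportionate adjustment by rows: m_ij = n_ij * m_i. / n_i. (Eq. 9)."""
    return [[cell * m_row[i] / sum(row) for cell in row]
            for i, row in enumerate(n)]


def s_by_cells(n, m):
    """Sum of (m_ij - n_ij)^2 / n_ij over all cells (Eq. 2)."""
    return sum((mc - nc) ** 2 / nc
               for n_row, m_row_ in zip(n, m)
               for nc, mc in zip(n_row, m_row_))


def s_by_rows(n, m_row):
    """Sum of (m_i. - n_i.)^2 / n_i. over the rows (Eq. 10)."""
    return sum((m_row[i] - sum(row)) ** 2 / sum(row)
               for i, row in enumerate(n))


# Sample frequencies n_ij (rows: six states; columns: four age classes).
n = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
# Deflated row totals m_i. (the controls of Case I).
m_row = [5252, 2395, 2432, 15766, 2330, 5662]

m = adjust_rows(n, m_row)
s_cells = s_by_cells(n, m)
s_rows = s_by_rows(n, m_row)
print(round(s_cells, 3), round(s_rows, 3))
```

Each row is handled independently, so any subset of rows could be adjusted without touching the others.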

Case II: Both sets of marginal totals known. Here the adjusted
cell frequencies must satisfy not only conditions (3) but also

$$
\sum_i m_{ij} = m_{\cdot j} \qquad j = 1, 2, \ldots, s - 1 \tag{11}
$$

there being now a total of r + s − 1 conditions. In both cases,

$$
m_{i\cdot} = N_{i\cdot}\, \frac{n}{N} \tag{12}
$$

$$
m_{\cdot j} = N_{\cdot j}\, \frac{n}{N} \tag{13}
$$

In other words, $m_{i\cdot}$ and $m_{\cdot j}$ are the deflated marginal totals, i.e.,
$N_{i\cdot}$ and $N_{\cdot j}$ divided by the actual sampling ratio N/n. The $m_{i\cdot}$
and $m_{\cdot j}$ are not independent, because

$$
N_{\cdot 1} + N_{\cdot 2} + \cdots + N_{\cdot s} = N_{1\cdot} + N_{2\cdot} + \cdots + N_{r\cdot} = N \tag{14}
$$

It is for this reason that if i runs through all r values in Eq. 3,
then j can run through only s − 1 in Eq. 11. A similar equation
also exists for the marginal totals of the sample, namely,

$$
n_{\cdot 1} + n_{\cdot 2} + \cdots + n_{\cdot s} = n_{1\cdot} + n_{2\cdot} + \cdots + n_{r\cdot} = n \tag{15}
$$

Solution of the two-dimensional Case II. In addition to Eq. 5
we now have also

$$
\sum_i \delta m_{ij} = 0 \qquad j = 1, 2, \ldots, s - 1 \tag{16}
$$

which comes by differentiating Eqs. 11. By addition of Eqs. 4, 5,
and 16, after multiplying Eq. $5_i$ by $-\lambda_i$ and Eq. $16_j$ by $-\mu_j$, we
obtain

$$
\sum \left\{ \frac{m_{ij} - n_{ij}}{n_{ij}} - \lambda_i - \mu_j \right\} \delta m_{ij} = 0 \tag{17}
$$

Equating each brace to zero, as before, we find that

$$
m_{ij} = n_{ij}(1 + \lambda_i + \mu_j) \tag{18}
$$

wherein $\mu_s$ is to be counted 0. The adjustment is now no longer
proportionate by rows, but involves every cell.


To evaluate the Lagrange multipliers in Eq. 18 we may sum the
two members downward and across in Fig. 14 and obtain the
r + s − 1 normal equations

$$
\begin{aligned}
n_{i\cdot} \lambda_i + \sum_j n_{ij} \mu_j &= m_{i\cdot} - n_{i\cdot} & i &= 1, 2, \ldots, r \\
\sum_i n_{ij} \lambda_i + n_{\cdot j} \mu_j &= m_{\cdot j} - n_{\cdot j} & j &= 1, 2, \ldots, s - 1
\end{aligned} \tag{19}
$$

These can be reduced for numerical computation. The top
row solved for $\lambda_i$ gives

$$
\lambda_i = \frac{1}{n_{i\cdot}} \Big\{ m_{i\cdot} - \sum_j n_{ij} \mu_j \Big\} - 1 \tag{20}
$$

whereupon by substitution into the bottom row of Eqs. 19 we
arrive at the s − 1 normal equations.


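$$
\begin{array}{cccc|c}
\mu_1 & \mu_2 & \cdots & \mu_{s-1} & = 1 \\ \hline
n_{\cdot 1} - \sum_i \dfrac{n_{i1}^2}{n_{i\cdot}} &
-\sum_i \dfrac{n_{i1} n_{i2}}{n_{i\cdot}} & \cdots &
-\sum_i \dfrac{n_{i1} n_{i,s-1}}{n_{i\cdot}} &
m_{\cdot 1} - \sum_i \dfrac{n_{i1} m_{i\cdot}}{n_{i\cdot}} \\
 & n_{\cdot 2} - \sum_i \dfrac{n_{i2}^2}{n_{i\cdot}} & \cdots &
-\sum_i \dfrac{n_{i2} n_{i,s-1}}{n_{i\cdot}} &
m_{\cdot 2} - \sum_i \dfrac{n_{i2} m_{i\cdot}}{n_{i\cdot}} \\
 & & \ddots & \vdots & \vdots \\
 & & & n_{\cdot, s-1} - \sum_i \dfrac{n_{i,s-1}^2}{n_{i\cdot}} &
m_{\cdot, s-1} - \sum_i \dfrac{n_{i,s-1} m_{i\cdot}}{n_{i\cdot}} \\
 & & & & 0
\end{array} \tag{21}
$$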


Because of symmetry in the coefficients, those below the diagonal
are not shown; indeed, in the systematic computation already
shown in Section 33 (p. 73), they are not used. The 0 in the
bottom row is appended for the computation of the minimized S,
if desired. The number of Lagrange multipliers to be solved for
directly is s − 1, and the remaining ones come by substitution
into Eq. 20, $\mu_s$ being counted 0.

A simple procedure for calculating the coefficients in the normal
equations (21) is to set up a preparatory table by dividing each
$n_{ij}$ in the ith row by $\sqrt{n_{i\cdot}}$; also to write down $m_{i\cdot}/\sqrt{n_{i\cdot}}$ for that
row, for use on the right-hand side of the normal equations (compare
Tables 1 and 2). In machine calculation the constant
divisor $\sqrt{n_{i\cdot}}$ would be left on the keyboard until the entire ith row
is divided; or, if reciprocal multiplication is preferred, the multiplier
$1/\sqrt{n_{i\cdot}}$ would be left on the keyboard. From this preparatory
table, the cumulation of squares and cross-products in the
vertical gives the required summations for the coefficients. The
sum check would be applied in the usual manner.

43. A numerical example of the two-dimensional Case II. The
fact is that in practice one need not bother about forming and solving
the normal equations because they will be displaced by a
simplifying iterative procedure, to be explained in a later section.
For illustration, however, we may do an example both ways, first
using the normal equations and the adjustment (18), later on
accomplishing the same results by the quicker method.

We may start with the unitalicized numbers in the 4 × 6 array
of Table 1, assuming these to be the sampling frequencies $n_{ij}$ to be
adjusted. Actually, they were obtained by deflating to 1/20th (for
a supposed 5 percent sample) the New England age × state table
on p. 1108 of vol. 2 of the Fifteenth Census of the U. S., 1930, then
varying the deflated values by chance with Tippett’s numbers to
get fictitious sampling frequencies $n_{ij}$. The italicized entries in
Table 1 represent the final (adjusted) frequencies $m_{ij}$, and it is
these that we now set out to get. We start off with the sampling
frequencies $n_{ij}$ and the known marginal totals $m_{\cdot 1}$, $m_{\cdot 2}$, etc., where
$m_{i\cdot} = N_{i\cdot} n/N$, $m_{\cdot j} = N_{\cdot j} n/N$, as in Eqs. 12 and 13. The
Lagrange multipliers shown along the left-hand and top borders
arise in the calculations now to be undertaken.

TABLE 1

A TABLE OF SAMPLE FREQUENCIES, A 5 PERCENT SAMPLE OF NATIVE WHITE
PERSONS OF NATIVE WHITE PARENTAGE ATTENDING SCHOOL, BY AGE BY STATE:
New England, 1930

(The adjusted frequency m_ij in each cell is shown italicized, here set
between asterisks, just below the corresponding sample frequency n_ij)

                          Age     7 to 13   14 & 15   16 & 17   18 to 20
                          j =        1         2         3         4        n_i.
                          μ_j =   0.0118    0.0149    0.0012       0       *m_i.*
State           i     λ_i

Maine           1   -0.0146        3623       781       557       313       5274
                                  *3613*     *781*     *550*     *308*     *5252*
New Hampshire   2   -0.0003        1570       395       251       155       2371
                                  *1588*     *401*     *251*     *155*     *2395*
Vermont         3    0.0234        1553       419       264       116       2352
                                  *1608*     *435*     *270*     *119*     *2432*
Massachusetts   4   -0.0162       10538      2455      1706      1160      15859
                                 *10492*    *2452*    *1680*    *1141*    *15766*
Rhode Island    5   -0.0230        1681       353       171       154       2359
                                  *1662*     *350*     *167*     *150*     *2330*
Connecticut     6   -0.0034        3882       857       544       339       5622
                                  *3915*     *867*     *543*     *338*     *5662*

                    n.j           22847      5260      3493      2237      33837
                    m.j          *22877*    *5286*    *3462*    *2213*    *33837*

The adjusted frequencies m_ij (italicized) are rounded off, hence when
summed may occasionally disagree a unit or so with the expected marginal
totals (also italicized). The latter arise by deflation from the universe rather
than by direct addition of the m_ij. The λ_i and μ_j are found in the solution of
Eqs. 20 and 21.

TABLE 2

Each sample frequency in Table 1 divided by the corresponding √n_i.

This operation would ordinarily be done a row at a time.

                     j =
           1        2        3        4      m_i./√n_i.     Sum

i = 1    49.89    10.75     7.67     4.31       72.32      144.94
    2    32.24     8.11     5.15     3.18       49.19       97.87
    3    32.02     8.64     5.44     2.39       50.15       98.64
    4    83.68    19.49    13.55     9.21      125.19      251.12
    5    34.61     7.27     3.52     3.17       47.97       96.54
    6    51.77    11.43     7.26     4.52       75.51      150.49

Sum     284.21    65.69    42.59    26.78      420.33      839.60


Table 2 is the preparatory table, advised at the close of the last
section. It is derived from Table 1 by dividing the ith row of
sample frequencies by $\sqrt{n_{i\cdot}}$. For example, the entry 8.64 in the
cell i = 3, j = 2 comes by dividing 419 by $\sqrt{2352}$, 419 being the
entry in the cell of the same indices in Table 1, and 2352 being the
sum of the third row. The sums at the bottom and right-hand side
are for checking the formation of the normal equations. The
cumulations of squares and cross-products along the vertical give
the summations required for the normal equations (Eqs. 21),
which now appear numerically as Eqs. 22.



$$
\begin{array}{c|rrr|r}
\text{Row} & \mu_1 & \mu_2 & \mu_3 & = 1 \ (\times 10^{-2}) \\ \hline
1 & 7413 & -3549 & -2354 & 3197 \\
2 & & 4441 & -544 & 2356 \\
3 & & & 3129 & -3222 \\
4 & & & & 0
\end{array} \tag{22}
$$
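The formation of Eqs. 22 from the preparatory table can be retraced in a few lines. The sketch below follows the route described above (divide each row by the square root of its total, then cumulate squares and cross-products down the columns); it is an illustration in plain Python with variable names of my own choosing, and because it carries full precision its results differ from the hand-computed figures by a unit or so in the last place.

```python
from math import sqrt

# Sample frequencies n_ij of Table 1 and the deflated marginal totals.
n = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
m_row = [5252, 2395, 2432, 15766, 2330, 5662]
m_col = [22877, 5286, 3462]          # only the first s - 1 columns are needed

# Preparatory table (Table 2): every entry of row i divided by sqrt(n_i.).
prep = [[cell / sqrt(sum(row)) for cell in row] for row in n]
prep_m = [m_row[i] / sqrt(sum(n[i])) for i in range(len(n))]

n_col = [sum(row[j] for row in n) for j in range(4)]
k = 3                                 # s - 1 multipliers mu_1, mu_2, mu_3

# Coefficients of Eqs. 21: n.j on the diagonal, less the cumulated products.
coef = [[(n_col[j] if j == h else 0.0) - sum(p[j] * p[h] for p in prep)
         for h in range(k)] for j in range(k)]
# Constant column: m.j less the cumulated products with m_i./sqrt(n_i.).
rhs = [m_col[j] - sum(p[j] * pm for p, pm in zip(prep, prep_m))
       for j in range(k)]

for row, b in zip(coef, rhs):
    print([round(v) for v in row], round(b, 2))
```

The printed rows should be compared with the upper triangle and constant column of Eqs. 22.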



Performing the solution by any favorite procedure one will obtain

$$
\mu_1 = 0.01182 \qquad \mu_2 = 0.01490 \qquad \mu_3 = 0.00119 \tag{23}
$$

whereupon by substitution into Eq. 20 comes

$$
\begin{aligned}
\lambda_1 &= -0.0146 & \lambda_4 &= -0.0162 \\
\lambda_2 &= -0.0003 & \lambda_5 &= -0.0230 \\
\lambda_3 &= +0.0234 & \lambda_6 &= -0.0034
\end{aligned} \tag{24}
$$

The next step is to compute the $m_{ij}$ by Eq. 18. Table 1 is now
bordered with the Lagrange multipliers for a convenient arrangement
of the factors required, and the calculation is completed. It
will be noted that, for example,

$$
m_{32} = 419(1 + 0.0234 + 0.0149) = 435 \tag{25}
$$

The $m_{ij}$ thus calculated are shown italicized in Table 1. The
marginal totals, found by adding the $m_{ij}$ just calculated, do not
agree exactly everywhere with the expected totals, because of
rounding off to integers: the errors of closure, however, are slight,
and it is a simple matter to raise or lower some of the larger cells
by a unit or two to force exact satisfaction of the conditions, if this
is desired. (Compare with the triangle problem on p. 84.)
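The whole of the Case II computation (Eqs. 21, 20 and 18 in turn) can be sketched as one short routine. This is an illustration, not the book's tabular procedure: the elimination is ordinary Gaussian elimination with partial pivoting, the function names are hypothetical, and the unrounded results agree with Eqs. 23 and 24 only to about the third decimal place, presumably because the hand computation works from the rounded preparatory table.

```python
def gauss_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(size)]
    for k in range(size):
        p = max(range(k, size), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, size):
            f = aug[r][k] / aug[k][k]
            aug[r] = [vr - f * vk for vr, vk in zip(aug[r], aug[k])]
    x = [0.0] * size
    for k in reversed(range(size)):
        tail = sum(aug[k][j] * x[j] for j in range(k + 1, size))
        x[k] = (aug[k][size] - tail) / aug[k][k]
    return x


def adjust_two_way(n, m_row, m_col):
    """Least squares adjustment of Case II; returns (lambda, mu, m)."""
    r, s = len(n), len(n[0])
    n_row = [sum(row) for row in n]
    n_col = [sum(n[i][j] for i in range(r)) for j in range(s)]
    k = s - 1                                   # mu_s is counted 0
    a = [[(n_col[j] if j == h else 0.0)
          - sum(n[i][j] * n[i][h] / n_row[i] for i in range(r))
          for h in range(k)] for j in range(k)]             # Eqs. 21
    b = [m_col[j] - sum(n[i][j] * m_row[i] / n_row[i] for i in range(r))
         for j in range(k)]
    mu = gauss_solve(a, b) + [0.0]
    lam = [(m_row[i] - sum(n[i][j] * mu[j] for j in range(s))) / n_row[i] - 1
           for i in range(r)]                                # Eq. 20
    m = [[n[i][j] * (1 + lam[i] + mu[j]) for j in range(s)]
         for i in range(r)]                                  # Eq. 18
    return lam, mu, m


n = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
m_row = [5252, 2395, 2432, 15766, 2330, 5662]
m_col = [22877, 5286, 3462, 2213]     # the last total is not used as a condition

lam, mu, m = adjust_two_way(n, m_row, m_col)
print([round(v, 4) for v in mu])
print([round(v, 4) for v in lam])
print(round(m[2][1]))                 # the cell of Eq. 25
```

Before rounding, the adjusted cells reproduce every row total and the first s − 1 column totals exactly; the last column then follows from the grand total.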

44. The three-dimensional problem. Here the N cards of the
universe are sorted and counted for one and perhaps a second and
third characteristic, and possibly crossed by pairs in various
combinations (Cases I-VII). The sample of n, however, is crossed
by all three characteristics, which is to say that the cell frequencies
$n_{ijk}$ are all known (refer to Fig. 15). As before, the adjusted
frequencies $m_{ijk}$ are required.





Case I: One set of slice totals known. Assume the slice totals
$N_{1\cdot\cdot}, N_{2\cdot\cdot}, \ldots, N_{r\cdot\cdot}$ to be known; the conditions are then

$$
\sum_{jk} m_{ijk} = m_{i\cdot\cdot} = N_{i\cdot\cdot}\, \frac{n}{N} \qquad i = 1, 2, \ldots, r \tag{26}
$$

being r in number. The summation to be minimized here is

$$
S = \sum \frac{1}{n_{ijk}} (m_{ijk} - n_{ijk})^2 \tag{27}
$$

being similar to that in Eq. 2, except that now there are three
indices to be summed over instead of two. Following a procedure
similar to that used before, we differentiate Eqs. 26 and 27 and
introduce the r Lagrange multipliers $\lambda_{i\cdot\cdot}$ with Eq. 26. The steps
are identical with those of the two-dimensional Case I, and the
result is at once

$$
m_{ijk} = n_{ijk}(1 + \lambda_{i\cdot\cdot}) = n_{ijk}\, \frac{m_{i\cdot\cdot}}{n_{i\cdot\cdot}} \tag{28}
$$

This adjustment, like that shown by Eq. 9, is a simple proportionate
one, but this time by slices rather than by rows. All cell
frequencies having the same i index are raised or lowered in the
same proportion.
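The slice adjustment of Eq. 28 can be sketched as follows. The 2 × 2 × 2 frequencies and the deflated slice totals below are made-up numbers chosen only to keep the arithmetic visible, and the function name `adjust_slices` is hypothetical.

```python
def adjust_slices(n, m_slice):
    """Proportionate adjustment by slices: m_ijk = n_ijk * m_i.. / n_i.. (Eq. 28).

    n is indexed as n[i][j][k]; m_slice[i] is the deflated total of slice i.
    """
    adjusted = []
    for i, slab in enumerate(n):
        slice_total = sum(sum(tube) for tube in slab)      # n_i..
        factor = m_slice[i] / slice_total                  # 1 + lambda_i..
        adjusted.append([[cell * factor for cell in tube] for tube in slab])
    return adjusted


# Hypothetical sample frequencies n_ijk and deflated slice totals m_i..
n = [
    [[10, 20], [30, 40]],     # slice i = 1, total 100
    [[5, 15], [25, 35]],      # slice i = 2, total 80
]
m_slice = [110, 72]

m = adjust_slices(n, m_slice)
print(m)
```

Every cell of the first slice is raised by one-tenth and every cell of the second is lowered by one-tenth, whatever its j and k indices.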

Case II: Two sets of slice totals known. Here, in addition to the
slice totals of Case I we know also

$$
N_{\cdot 1 \cdot}, N_{\cdot 2 \cdot}, \ldots, N_{\cdot s \cdot}
$$

whence arise the s − 1 additional conditions

$$
\sum_{ik} m_{ijk} = m_{\cdot j \cdot} = N_{\cdot j \cdot}\, \frac{n}{N} \qquad j = 1, 2, \ldots, s - 1 \tag{29}
$$

Using the Lagrange multipliers $\lambda_{\cdot j \cdot}$ here, and $\lambda_{i\cdot\cdot}$ with Eq. 26 as
before, we find that

$$
m_{ijk} = n_{ijk}(1 + \lambda_{i\cdot\cdot} + \lambda_{\cdot j \cdot}) \tag{30}
$$

in which $\lambda_{\cdot s \cdot}$ is to be counted zero. This adjustment is proportionate
by tubes, the ratio $m_{ijk}/n_{ijk}$ being constant along the ijth tube
and in fact equal to $m_{ij\cdot}/n_{ij\cdot}$, independent of k. Unfortunately
we do not here know the face totals $m_{ij\cdot}$ and are unable to make use
of the proportionality as we shall in Case IV.

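To solve for the r + s − 1 Lagrange multipliers we sum the
members of Eq. 30 over j and then over i and arrive at the normal
equations

$$
\begin{aligned}
n_{i\cdot\cdot} \lambda_{i\cdot\cdot} + \sum_j n_{ij\cdot} \lambda_{\cdot j \cdot} &= m_{i\cdot\cdot} - n_{i\cdot\cdot}, & i &= 1, 2, \ldots, r \\
\sum_i n_{ij\cdot} \lambda_{i\cdot\cdot} + n_{\cdot j \cdot} \lambda_{\cdot j \cdot} &= m_{\cdot j \cdot} - n_{\cdot j \cdot}, & j &= 1, 2, \ldots, s - 1
\end{aligned} \tag{31}
$$

These can be reduced to s − 1 equations in precisely the same way
that Eqs. 19 were reduced, but, because of the great advantage of
the iterative process to come further on, we shall not pursue the
reduction here.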

Case III: All three sets of slice totals known. All slice totals

$$N_{1..}, N_{2..}, \dots, N_{r..}$$

$$N_{.1.}, N_{.2.}, \dots, N_{.s.}$$

$$N_{..1}, N_{..2}, \dots, N_{..t}$$

now being known, in addition to conditions (26) and (29) we
require here

$$\sum_{ij} m_{ijk} = m_{..k} = N_{..k}\,\frac{n}{N}, \qquad k = 1, 2, \dots, t - 1 \tag{32}$$

which makes a total of $r + (s - 1) + (t - 1)$ or $r + s + t - 2$
conditions. The same kind of manipulation as used heretofore
gives

$$m_{ijk} = n_{ijk}(1 + \lambda_{i..} + \lambda_{.j.} + \lambda_{..k}) \tag{33}$$

with $\lambda_{.s.}$ and $\lambda_{..t}$ to be counted zero. The adjustment is no longer
proportionate by slices or tubes, but involves every cell. In
practice, once the normal equations are solved and the Lagrange
multipliers worked out, one proceeds very much as in the two-dimensional
Case II: for each of the t slices, corresponding to the
t values of k, there will be a two-dimensional adjustment, the 1 in
Eq. 18 being replaced now by $1 + \lambda_{..k}$.

The normal equations for the Lagrange multipliers can be found
by performing double summations on Eq. 33. The result is

$$\begin{aligned}
n_{i..}\lambda_{i..} + \sum_j n_{ij.}\lambda_{.j.} + \sum_k n_{i.k}\lambda_{..k} &= m_{i..} - n_{i..}, & i &= 1, 2, \dots, r \\
\sum_i n_{ij.}\lambda_{i..} + n_{.j.}\lambda_{.j.} + \sum_k n_{.jk}\lambda_{..k} &= m_{.j.} - n_{.j.}, & j &= 1, 2, \dots, s - 1 \\
\sum_i n_{i.k}\lambda_{i..} + \sum_j n_{.jk}\lambda_{.j.} + n_{..k}\lambda_{..k} &= m_{..k} - n_{..k}, & k &= 1, 2, \dots, t - 1
\end{aligned} \tag{34}$$

If these calculations were to be carried out, one would simplify the
computation by solving the top row for $\lambda_{i..}$, getting

$$\lambda_{i..} = \frac{m_{i..} - n_{i..}}{n_{i..}} - \frac{1}{n_{i..}}\sum_j n_{ij.}\lambda_{.j.} - \frac{1}{n_{i..}}\sum_k n_{i.k}\lambda_{..k} \tag{35}$$

and then substituting this into the middle and last rows of Eqs. 34
to get a reduced set of $s + t - 2$ normal equations for the Lagrange
multipliers $\lambda_{.j.}$ and $\lambda_{..k}$, the numerical values of which when set
back into Eq. 35 give the $\lambda_{i..}$. In all the summations of Eqs. 34
and 35, $\lambda_{.s.}$ and $\lambda_{..t}$ would be counted zero. But here again, the
iterative process to be explained later will displace the use of normal
equations, so actually we are not interested in reducing them.

Case IV: One set of face totals known. It may be that the rs face
totals

$$N_{11.}, N_{12.}, \dots, N_{ij.}, \dots, N_{rs.}$$

are known from crossing the i and j characters in the universe.
The conditions are then

$$\sum_k m_{ijk} = m_{ij.} = N_{ij.}\,\frac{n}{N}, \qquad i = 1, 2, \dots, r; \quad j = 1, 2, \dots, s \tag{36}$$

The adjustment here turns out to be

$$m_{ijk} = n_{ijk}(1 + \lambda_{ij.}) \tag{37}$$

but by summing both sides over the index k to evaluate $\lambda_{ij.}$ it is
seen that

$$m_{ij.} = n_{ij.}(1 + \lambda_{ij.}) \tag{38}$$

whence

$$m_{ijk} = n_{ijk}\,\frac{m_{ij.}}{n_{ij.}} \tag{39}$$

This adjustment is thus proportionate by tubes, like that in Eq. 30,
though here the factor $m_{ij.}/n_{ij.}$ is known and Eq. 39 can be applied
at once.
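As an illustration of Eq. 39, the following minimal sketch scales every tube of a small three-way table to a known face total. The function name `adjust_by_tubes`, the 2 x 2 x 2 sample, and the face totals are all hypothetical; they do not come from the text.

```python
def adjust_by_tubes(n, m_face):
    """Apply Eq. 39: m_ijk = n_ijk * m_ij. / n_ij. for every tube (i, j)."""
    adjusted = []
    for i, layer in enumerate(n):
        new_layer = []
        for j, tube in enumerate(layer):
            tube_total = sum(tube)                 # n_ij., summed over k
            factor = m_face[i][j] / tube_total     # the known ratio m_ij. / n_ij.
            new_layer.append([cell * factor for cell in tube])
        adjusted.append(new_layer)
    return adjusted


# Hypothetical sample frequencies, indexed n[i][j][k].
n = [[[10, 30], [20, 20]],
     [[5, 15], [40, 10]]]
# Hypothetical deflated face totals m_ij. (same grand total, 150, as the sample).
m_face = [[44.0, 36.0], [22.0, 48.0]]

m = adjust_by_tubes(n, m_face)
```

Within any one tube the ratio of adjusted to sample frequency is the same for every k, which is what "proportionate by tubes" means.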

Case V: One set of face totals, and one set of slice totals known.
Sometimes, in addition to the rs face totals of Case IV, the slice
totals

$$N_{..1}, N_{..2}, \dots, N_{..t}$$

will also be known, in which circumstances the conditions (36) are
to be accompanied by

$$\sum_{ij} m_{ijk} = m_{..k} = N_{..k}\,\frac{n}{N}, \qquad k = 1, 2, \dots, t - 1 \tag{40}$$

The same procedure as previously applied yields now

$$m_{ijk} = n_{ijk}(1 + \lambda_{ij.} + \lambda_{..k}) \tag{41}$$

with $\lambda_{..t}$ to be counted zero. Summations performed over k, and
then over i and j together, give the normal equations

$$\begin{aligned}
n_{ij.}\lambda_{ij.} + \sum_k n_{ijk}\lambda_{..k} &= m_{ij.} - n_{ij.} \\
\sum_{ij} n_{ijk}\lambda_{ij.} + n_{..k}\lambda_{..k} &= m_{..k} - n_{..k}
\end{aligned} \tag{42}$$

The number of equations is $rs + t - 1$, since $\lambda_{..t}$ does not exist.
As before, a simplification can be effected by solving the top row
for $\lambda_{ij.}$ and making a substitution into the lower one, but, because
of the great advantage of the iterative process to be seen further
on, we shall not carry out the reduction.

Before going on it might be noted that although this case is three-dimensional,
it reduces to the two-dimensional Case II if one
considers that ij. is one index running through the values 11, 12,
..., 21, 22, ..., rs, and that ..k is a second index running
through the values 1, 2, ..., t. This can be seen by the similarity
between Eqs. 42 and 19.
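The reduction of Case V to the two-dimensional Case II amounts to treating each (i, j) pair as a single row label. A minimal sketch, with a hypothetical helper `flatten_ij` and hypothetical 2 x 2 x 2 data:

```python
def flatten_ij(n):
    """Return a two-way table: one row per (i, j) tube, one column per k."""
    return [list(tube) for layer in n for tube in layer]


# Hypothetical sample frequencies, indexed n[i][j][k].
n = [[[10, 30], [20, 20]],
     [[5, 15], [40, 10]]]

flat = flatten_ij(n)                              # rows ordered 11, 12, 21, 22
row_totals = [sum(row) for row in flat]           # face totals n_ij.
col_totals = [sum(col) for col in zip(*flat)]     # slice totals n_..k
```

The row totals of the flattened table are the face totals and its column totals are the slice totals, so any two-dimensional routine for Case II can be applied to `flat` unchanged.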






Case VI: Two sets of face totals known. If in addition to the
face totals of Case IV, the face totals

$$N_{.11}, N_{.12}, \dots, N_{.st}$$

are also known from further crossing the j and k characters in the
universe, we shall require

$$\sum_i m_{ijk} = m_{.jk} = N_{.jk}\,\frac{n}{N}, \qquad j = 1, 2, \dots, s; \quad k = 1, 2, \dots, t - 1 \tag{43}$$

in addition to the conditions (36). In place of Eq. 39 of Case IV
we now find that

$$m_{ijk} = n_{ijk}(1 + \lambda_{ij.} + \lambda_{.jk}) \tag{44}$$

in which $\lambda_{.jt}$ is to be counted zero for all j. No simple relation
such as Eq. 39 is possible here, because the adjustment is not
proportionate by tubes; the Lagrange multipliers must be evaluated.
This can be accomplished by summing the members of
Eq. 44 over k and i in turn, resulting in the normal equations

$$\begin{aligned}
n_{ij.}\lambda_{ij.} + \sum_k n_{ijk}\lambda_{.jk} &= m_{ij.} - n_{ij.} \\
\sum_i n_{ijk}\lambda_{ij.} + n_{.jk}\lambda_{.jk} &= m_{.jk} - n_{.jk}
\end{aligned} \tag{45}$$

Since $\lambda_{.jt}$ does not exist for any values of j, the number of equations
is $rs + s(t - 1) = s(r + t - 1)$. They break up at once into
s sets each of $r + t - 1$ equations, one set for every j value. In
fact, the problem can be considered as s sets of the two-dimensional
Case II. Any one value of j gives a slice, which can be looked
upon as fulfilling the specifications of the two-dimensional Case II.
Each set of normal equations can be reduced in the same manner
that Eqs. 19 were reduced.

Case VII: All three sets of face totals known. All totals now being
known, we require

$$\sum_k m_{ijk} = m_{ij.} = N_{ij.}\,\frac{n}{N}, \qquad i = 1, 2, \dots, r; \quad j = 1, 2, \dots, s \tag{36}$$

$$\sum_i m_{ijk} = m_{.jk} = N_{.jk}\,\frac{n}{N}, \qquad j = 1, 2, \dots, s; \quad k = 1, 2, \dots, t - 1 \tag{43}$$

$$\sum_j m_{ijk} = m_{i.k} = N_{i.k}\,\frac{n}{N}, \qquad i = 1, 2, \dots, r - 1; \quad k = 1, 2, \dots, t - 1 \tag{46}$$

The adjusting relation is

$$m_{ijk} = n_{ijk}(1 + \lambda_{ij.} + \lambda_{.jk} + \lambda_{i.k}) \tag{47}$$

in which $\lambda_{.jt}$ is to be counted zero for any j, $\lambda_{r.k}$ for any k, and
$\lambda_{i.t}$ for any i. The normal equations for the Lagrange multipliers
are

$$\begin{aligned}
n_{ij.}\lambda_{ij.} + \sum_k n_{ijk}\lambda_{.jk} + \sum_k n_{ijk}\lambda_{i.k} &= m_{ij.} - n_{ij.} \\
\sum_i n_{ijk}\lambda_{ij.} + n_{.jk}\lambda_{.jk} + \sum_i n_{ijk}\lambda_{i.k} &= m_{.jk} - n_{.jk} \\
\sum_j n_{ijk}\lambda_{ij.} + \sum_j n_{ijk}\lambda_{.jk} + n_{i.k}\lambda_{i.k} &= m_{i.k} - n_{i.k}
\end{aligned} \tag{48}$$

being $rs + rt + st - r - s - t + 1$ in number. They can be
reduced in the same way that previous normal equations have
been reduced; but here again, the iterative process will render the
use of normal equations unnecessary, except for theoretical purposes,
e.g., justification of the iterative process.
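The numbers of conditions (and hence of Lagrange multipliers) stated for the seven three-dimensional cases can be tabulated with a short sketch. The function name `condition_counts` is hypothetical; the counts themselves are those given in the text, written so that each redundant condition dropped shows up as a "- 1".

```python
def condition_counts(r, s, t):
    """Number of independent conditions in each three-dimensional case."""
    return {
        "I": r,                                        # Eq. 26
        "II": r + (s - 1),                             # Eqs. 26, 29
        "III": r + (s - 1) + (t - 1),                  # Eqs. 26, 29, 32
        "IV": r * s,                                   # Eq. 36
        "V": r * s + (t - 1),                          # Eqs. 36, 40
        "VI": r * s + s * (t - 1),                     # Eqs. 36, 43
        "VII": r * s + s * (t - 1) + (r - 1) * (t - 1),  # Eqs. 36, 43, 46
    }


counts = condition_counts(3, 4, 5)   # a hypothetical 3 x 4 x 5 table
```

For Case VII the sum of the three terms expands to rs + rt + st - r - s - t + 1, in agreement with the count stated after the normal equations.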

45. A simplified procedure: iterative proportions. The number
of Lagrange multipliers in any problem is equal to the number
of conditions imposed on the adjustment (Sec. 27). Here the
conditions have appeared in sets, depending on which marginal
totals are involved. By a comparison of Eqs. 9 and 28 on the one
hand, with Eqs. 18, 30, 33, 41, 44, and 47 on the other, we see that
wherever there was only one set of marginal totals involved we
came out with a simple proportionate adjustment, but that in all
other cases it was not so; the Lagrange multipliers involved were
unfortunately related to one another through normal equations.

We need a simplification. It is a fact that as a first approximation
the adjustments may all be considered proportionate, in
either the horizontal or the vertical. We shall be able to write
down an expression for the error in this approximation, and shall
be able to reduce it sufficiently by a succession of proportionate
adjustments.

Take the two-dimensional Case II for an example. In Eq. 20
one may recognize $(1/n_{i.}) \sum_j n_{ij}\mu_j$ as a weighted average of $\mu_j$ for
the ith row. There will be a weighted average of $\mu_j$ for the first
row, another for the second, etc., one for each value of i; consequently
one may appropriately speak of the ith average of $\mu_j$,
writing it i-av $\mu_j$. Substituting from Eq. 20 into 18 one then sees
the adjustment (18) appear as

$$m_{ij} = n_{ij}\left\{\frac{m_{i.}}{n_{i.}} + \mu_j - i\text{-av}\,\mu_j\right\} \tag{49}$$

If, on the other hand, $\mu_j$ had been eliminated from Eqs. 19, instead
of $\lambda_i$, the result would have been

$$m_{ij} = n_{ij}\left\{\frac{m_{.j}}{n_{.j}} + \lambda_i - j\text{-av}\,\lambda_i\right\} \tag{50}$$

From either Eq. 49 or 50 it is clear why the adjustment (18) is not
proportionate by rows or columns, and why Case II does not break
up into r or s sets of Case I: the reason is that $\mu_j$ in any cell is not
necessarily equal to the average $\mu_j$ for that row, nor is $\lambda_i$ in any cell
necessarily equal to the average $\lambda_i$ for that column. If nevertheless
one were to make the simple proportionate adjustment

$$m'_{ij} = n_{ij}\,\frac{m_{i.}}{n_{i.}} \tag{51}$$

along the horizontal in the ith row, the horizontal conditions (3)
will be enforced but not the vertical ones (11); i.e., it will be found
that $m'_{i.} = m_{i.}$, but that usually not all $m'_{.j} = m_{.j}$. This is
because Eq. 51 effects only a partial adjustment, each $m'_{ij}$ being
in error through the disparity between the $\mu_j$ proper to the jth
column, and the average of all the $\mu_j$ for the ith row, as seen in
Eq. 49. This error can then be diminished by turning the process
around and subjecting these $m'_{ij}$ to a proportionate adjustment in
the vertical according to the equation

$$m''_{ij} = m'_{ij}\,\frac{m_{.j}}{m'_{.j}} \tag{52}$$

which may be considered an application of Eq. 50 wherein the disparity
between any $\lambda_i$ and the average $\lambda_i$ for the jth column has
been neglected. It is the vertical conditions that will now be
found satisfied, but perhaps not all of the horizontal ones, because
some of the row totals may have been disturbed. The cycle
initiated by Eq. 51 is therefore repeated, and the process is continued
until the table reproduces itself and becomes rigid with the
satisfaction of all the conditions, both horizontal and vertical.
The final results theoretically do not coincide with the least squares
solution, but in practice they usually do, closely enough.

Usually two cycles suffice. In practice the work proceeds
rapidly, requiring only about one-seventh as much time as setting
up the normal equations and solving them. Tables 3-5 show
the various stages of the work when the method of iterative proportions
is applied to the sample frequencies of Table 1. It will
be noticed that the results of the third approximation (Table 5)
are final, since if the process were continued, the table would only
reproduce itself.
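The row-then-column cycle of Eqs. 51 and 52 can be sketched in a few lines. This is an illustration under stated assumptions, not the author's worksheet: the function names are hypothetical, the sample frequencies are those appearing in Table 8 (the cell $n_{21} = 1570$ is confirmed by step 4 of Sec. 48, and $n_{41} = 10538$ follows from the row total 15859), and no rounding to integers is done between stages, whereas the printed tables are rounded at each stage.

```python
# Sample frequencies n_ij (6 rows, 4 columns) and the deflated marginal totals.
N_SAMPLE = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
M_ROWS = [5252, 2395, 2432, 15766, 2330, 5662]   # m_i.
M_COLS = [22877, 5285, 3462, 2213]               # m_.j


def adjust_rows(table, row_targets):
    """Eq. 51: scale each row so that its total equals the known m_i."""
    return [[x * target / sum(row) for x in row]
            for row, target in zip(table, row_targets)]


def adjust_cols(table, col_targets):
    """Eq. 52: scale each column so that its total equals the known m_.j"""
    col_sums = [sum(col) for col in zip(*table)]
    return [[x * t / c for x, t, c in zip(row, col_targets, col_sums)]
            for row in table]


def iterative_proportions(table, row_targets, col_targets, tol=1e-6, max_cycles=100):
    """Repeat the row/column cycle until the row totals are no longer disturbed."""
    current = [[float(x) for x in row] for row in table]
    for cycle in range(1, max_cycles + 1):
        current = adjust_rows(current, row_targets)
        current = adjust_cols(current, col_targets)
        worst = max(abs(sum(row) - t) for row, t in zip(current, row_targets))
        if worst < tol:
            return current, cycle
    return current, max_cycles


first_stage = adjust_rows(N_SAMPLE, M_ROWS)       # corresponds to Table 3
final, cycles = iterative_proportions(N_SAMPLE, M_ROWS, M_COLS)
```

Rounding `first_stage` to integers gives the first-stage figures; `final` is the table that reproduces itself under a further cycle.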


TABLE 3

The method of iterative proportions applied to the data of Table 1
(first stage)

A proportionate adjustment by rows, by Eq. 51. Note that m'_i. = m_i.,
but that m'_.j ≠ m_.j.

            j = 1       2       3       4    m'_i.     m_i.
i = 1        3608     778     555     312     5253     5252
    2        1586     399     254     157     2396     2395
    3        1606     433     273     120     2432     2432
    4       10476    2441    1696    1153    15766    15766
    5        1660     349     169     152     2330     2330
    6        3910     863     548     341     5662     5662
m'_.j       22846    5263    3495    2235    33839
m_.j        22877    5285    3462    2213             33837


TABLE 4

A continuation of the process initiated in Table 3
(second stage)

The figures in Table 3 are now adjusted proportionately by columns according to
Eq. 52. The vertical totals m''_.j and m_.j now are equal, but the agreement of the
horizontal totals accomplished in Table 3 has been slightly disturbed.

            j = 1       2       3       4   m''_i.     m_i.
i = 1        3613     781     550     309     5253     5252
    2        1588     401     252     155     2396     2395
    3        1608     435     270     119     2432     2432
    4       10490    2451    1680    1142    15763    15766
    5        1662     350     167     151     2330     2330
    6        3915     867     543     338     5663     5662
m''_.j      22876    5285    3462    2214    33837
m_.j        22877    5285    3462    2213             33837


TABLE 5

The cycle commenced again
(third stage)

The figures of Table 4 are subjected to a proportionate adjustment by rows, according
to Eq. 51. And since these results turn out to be almost a reproduction of
Table 4, but with both horizontal and vertical conditions satisfied, they are considered
final. The agreement with the m_ij in Table 1 should be noted.

            j = 1       2       3       4  m'''_i.     m_i.
i = 1        3612     781     550     309     5252     5252
    2        1587     401     252     155     2395     2395
    3        1608     435     270     119     2432     2432
    4       10492    2451    1680    1142    15765    15766
    5        1662     350     167     151     2330     2330
    6        3914     867     543     338     5662     5662
m'''_.j     22875    5285    3462    2214    33836
m_.j        22877    5285    3462    2213             33837


46. Iterative proportions in three dimensions. The same
process can be extended to three or more dimensions with an even
greater relative saving in time. To see how the method of iterative
proportions applies in one of the three-dimensional cases, we
may go back to the three-dimensional Case III. By the substitution
afforded through Eq. 35 the adjusting Eq. 33 may be put
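in the form

$$m_{ijk} = n_{ijk}\left\{\frac{m_{i..}}{n_{i..}} + (\lambda_{.j.} - i\text{-av}\,\lambda_{.j.}) + (\lambda_{..k} - i\text{-av}\,\lambda_{..k})\right\} \tag{53}$$

and, on eliminating $\lambda_{.j.}$ or $\lambda_{..k}$ instead, in the forms

$$m_{ijk} = n_{ijk}\left\{\frac{m_{.j.}}{n_{.j.}} + (\lambda_{i..} - j\text{-av}\,\lambda_{i..}) + (\lambda_{..k} - j\text{-av}\,\lambda_{..k})\right\} \tag{54}$$

$$m_{ijk} = n_{ijk}\left\{\frac{m_{..k}}{n_{..k}} + (\lambda_{i..} - k\text{-av}\,\lambda_{i..}) + (\lambda_{.j.} - k\text{-av}\,\lambda_{.j.})\right\} \tag{55}$$

Any of these three equations shows why the adjustment (33) is not
proportional by slices, and why this case does not break up into r
or s or t sets of the three-dimensional Case I. As a first approximation
it does, as is now clear from these three equations, and by
making successive proportionate adjustments we may thus arrive
at the final values. To go about the work one could first calculate
the values of

$$m'_{ijk} = n_{ijk}\,\frac{m_{i..}}{n_{i..}} \tag{56}$$

then

$$m''_{ijk} = m'_{ijk}\,\frac{m_{.j.}}{m'_{.j.}} \tag{57}$$

followed by

$$m'''_{ijk} = m''_{ijk}\,\frac{m_{..k}}{m''_{..k}} \tag{58}$$

These three successive adjustments would constitute a cycle,
which would then be repeated in whole or in part until the table
becomes rigid with the satisfaction of all three sets of conditions.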
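A minimal sketch of the three-step cycle of Eqs. 56 to 58 follows. The function names, the 2 x 2 x 2 sample, and the three sets of slice totals are hypothetical and chosen only so that each set adds to the same grand total; nothing here is taken from the author's data.

```python
def slice_totals(table, axis):
    """Totals over the two axes other than `axis` (0 for i, 1 for j, 2 for k)."""
    sizes = (len(table), len(table[0]), len(table[0][0]))
    totals = [0.0] * sizes[axis]
    for i, layer in enumerate(table):
        for j, tube in enumerate(layer):
            for k, cell in enumerate(tube):
                totals[(i, j, k)[axis]] += cell
    return totals


def scale_slices(table, axis, targets):
    """One proportionate adjustment by slices (Eq. 56, 57 or 58, by choice of axis)."""
    current = slice_totals(table, axis)
    return [[[cell * targets[(i, j, k)[axis]] / current[(i, j, k)[axis]]
              for k, cell in enumerate(tube)]
             for j, tube in enumerate(layer)]
            for i, layer in enumerate(table)]


def iterate_three_way(n, m_i, m_j, m_k, tol=1e-9, max_cycles=200):
    """Repeat the cycle until the i and j slice totals are no longer disturbed."""
    m = [[[float(x) for x in tube] for tube in layer] for layer in n]
    for cycle in range(1, max_cycles + 1):
        for axis, targets in enumerate((m_i, m_j, m_k)):
            m = scale_slices(m, axis, targets)
        # The k totals are exact after the last step; check the other two sets.
        gap = max(abs(a - b)
                  for axis, targets in enumerate((m_i, m_j))
                  for a, b in zip(slice_totals(m, axis), targets))
        if gap < tol:
            return m, cycle
    return m, max_cycles


# Hypothetical sample n[i][j][k] (grand total 150) and hypothetical slice totals.
N3 = [[[10, 30], [20, 20]],
      [[5, 15], [40, 10]]]
M_I, M_J, M_K = [78.0, 72.0], [63.0, 87.0], [70.0, 80.0]

adjusted3, cycles3 = iterate_three_way(N3, M_I, M_J, M_K)
```

When the loop stops, all three sets of slice totals of `adjusted3` agree with the targets to within the tolerance, which is the "rigid" table of the text.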

47. Simplification when only one cell requires adjustment. On
occasions it happens in sampling that one is especially interested
in one particular cell of the universe, and would like to have a
result for it in advance before the other cells are adjusted. Sometimes
it even happens that the others individually are of no particular
concern. In such circumstances one merely places the cell of
interest in one corner of the table by an appropriate interchange of
rows and columns, and then compresses the rest of the table into
the cells adjacent to it. In the two-dimensional Case II one would
thus work with a 2 × 2 table, one corner cell being the one of special
interest, the other three being the result of compression. The
marginal totals of the row and column belonging to the cell of
interest are unaffected. For illustration we may suppose that
from the sample shown in Table 1 we require only $m_{61}$. We then
start with the 2 × 2 Table 6, which is derived from Table 1 by
compression. Commencing with Table 6, one might first adjust
by rows according to Eq. 51, then by columns by Eq. 52. One
cycle of iterative proportions is sufficient, as is seen in Table 7,
and the value 3915 found for $m_{61}$ is in good agreement with its
value shown in Tables 1 and 5. The scheme of compression
provides a quick method of getting out an advance adjustment for
a cell of special interest, and the result so obtained will ordinarily
be in good agreement with what comes later when and if all the
cells are adjusted.

In the three-dimensional Cases II, III, V, VI, and VII, one
compresses the original table to a 2 × 2 × 2 table, and then uses
the method of iterative proportions. (The other cases do not
require consideration, since they are proportionate adjustments
wherein one is already at liberty to adjust as few or as many cells
as he likes without altering the equations or the routine.) The
same procedure can be extended to the adjustment of two cells, the
only modification being that in two dimensions we shall compress
to a 2 × 3 or a 3 × 3 table, depending on whether the two cells
do or do not lie in the same row or column. In three dimensions
we compress to a 2 × 2 × 3, or a 2 × 3 × 3, or a 3 × 3 × 3
table; the first if the two cells lie in the same i, j, or k tube, the
second if they lie in the same slice but not in the same tube, the
third if they are in separate slices.


TABLE 6

Derived from Table 1 by compression, the cell i = 6, j = 1, requiring
adjustment

              j = 1    j = 2-4     n_i.     m_i.
i = 1-5       18965       9250    28215    28175
i = 6          3882       1740     5622     5662
n_.j          22847      10990    33837
m_.j          22877      10960             33837


TABLE 7

A proportionate adjustment of Table 6

   Rows adjusted by Eq. 51          Columns adjusted by Eq. 52
   18938      9237     28175        18962      9213     28175
    3910      1752      5662         3915      1747      5662
   22848     10989     33837        22877     10960     33837

Conclusion: m_61 = 3915
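The compression to a 2 × 2 table followed by one cycle of Eqs. 51 and 52 can be sketched as below. The function names are hypothetical; the sample frequencies are those appearing in Table 8 (with $n_{21} = 1570$ and $n_{41} = 10538$ as fixed by step 4 of Sec. 48 and by the row total 15859), and the sketch puts the cell of interest in the first row and column rather than the last row, which does not affect the result.

```python
N_SAMPLE = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
M_ROWS = [5252, 2395, 2432, 15766, 2330, 5662]   # m_i.
M_COLS = [22877, 5285, 3462, 2213]               # m_.j


def compress_to_2x2(table, row, col):
    """Collapse the table around cell (row, col), which lands at position [0][0]."""
    cell = table[row][col]
    row_rest = sum(table[row]) - cell                 # rest of the cell's row
    col_rest = sum(r[col] for r in table) - cell      # rest of the cell's column
    total = sum(sum(r) for r in table)
    return [[cell, row_rest],
            [col_rest, total - cell - row_rest - col_rest]]


def one_cycle(table, row_targets, col_targets):
    """Eq. 51 (rows) followed by Eq. 52 (columns), once."""
    by_rows = [[x * t / sum(r) for x in r] for r, t in zip(table, row_targets)]
    col_sums = [sum(c) for c in zip(*by_rows)]
    return [[x * t / c for x, t, c in zip(r, col_targets, col_sums)]
            for r in by_rows]


small = compress_to_2x2(N_SAMPLE, 5, 0)              # cell i = 6, j = 1
grand_total = sum(M_ROWS)
m_small = one_cycle(
    small,
    [M_ROWS[5], grand_total - M_ROWS[5]],            # 5662 and 28175
    [M_COLS[0], grand_total - M_COLS[0]],            # 22877 and 10960
)
```

`small` reproduces the four cell entries of Table 6, and `m_small[0][0]` rounds to the advance value of $m_{61}$ stated in the text.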


48. The Stephan method. An iterative procedure devised by
Stephan 2 has the advantage of being in theoretical agreement with
the least squares solution. It is moreover self-checking and self-correcting,
and requires the writing of only a few figures. Only
one table is required, since the factors required in the computations
are appended below and to the right, and all the figures needed can
be written into this one table (see Table 8). The method will
converge to the least squares solution even when some cells are
vacant or contain huge sampling errors. This is possible because
the method may be used under any desired system of weighting.

Directions follow for carrying out the computations of the
Stephan method in two dimensions when the weight of any adjusted
frequency is assumed to be inversely proportional to the
corresponding sample frequency, as in the development of the
normal equations (cf. Sec. 41). The numerical illustrations refer
to Table 8 of this chapter, which is derived from the same sample
frequencies as those in Table 1.

2 Frederick F. Stephan, "An iterative method of adjusting sample frequency
tables when expected marginal totals are known," Annals of Mathematical
Statistics, vol. XIII, No. 2, June 1942: pp. 166-178.

CYCLE 1

1. Compute the factors $p_i(1)$ equal to $m_{i.}/2n_{i.}$. Enter
each factor in the proper row, one below the other, in the
column headed $p_i(1)$. (For instance, $p_2(1) = 2395/(2 \times 2371)
= 0.50506$, and is entered opposite the second row.)

2. Multiply each sample frequency $n_{ij}$ in column j by the
corresponding factor $p_i(1)$. These products are not needed
individually; they are to be accumulated in the product register
of the machine until the vertical total for the column is obtained.
(In column 4, this vertical total is 313 × 0.49791
+ 155 × 0.50506 + 116 × 0.51701 + 1160 × 0.49707 + 154
× 0.49385 + 339 × 0.50356 = 1117.46423.) This total is not
to be written down, but is to be transferred to the keyboard for
the subtraction called for in the next step.

3. Subtract this accumulated total from the corresponding
deflated universe column total $m_{.j}$. Then divide this difference
by the corresponding sample column total $n_{.j}$ to get the factor
$q_j(1)$. (For instance, $\{m_{.4} - \sum_i n_{i4} \times p_i(1)\}/n_{.4} = \{2213
- 1117.46423\}/2237 = 0.48973 = q_4(1)$. This is the only
figure written down from steps 2 and 3.) Do steps 2 and 3 for
every column.

CYCLE 2

4. Multiply each sample frequency in row i by its corresponding
factor $q_j(1)$. These products are not needed individually;
they are to be accumulated in the product register of the
machine until the horizontal total for the row is obtained. (In
row 2, this horizontal total is 1570 × 0.50134 + 395 × 0.50453
+ 251 × 0.49099 + 155 × 0.48973 = 1185.53979.) This total
is not to be written down, but is to be transferred to the keyboard
for the subtraction called for in the next step.

5. Subtract this accumulated total from the corresponding
deflated universe row total $m_{i.}$. Then divide this difference by
the corresponding sample row total $n_{i.}$ to get the factor
$p_i(2)$. (For instance, $\{m_{2.} - \sum_j n_{2j} \times q_j(1)\}/n_{2.} = \{2395
- 1185.53979\}/2371 = 0.51011 = p_2(2)$.) Do steps 4 and 5
for every row.

6. Repeat step 2, using the factors $p_i(2)$. (In column 4,
the vertical total is 313 × 0.49580 + 155 × 0.51011 + 116
× 0.53384 + 1160 × 0.49426 + 154 × 0.48740 + 339 × 0.50699
= 1116.44870.)

7. Repeat step 3 to get the factors $q_j(2)$. (For instance,
$\{m_{.4} - \sum_i n_{i4} \times p_i(2)\}/n_{.4} = \{2213 - 1116.44870\}/2237
= 0.49019 = q_4(2)$.)

CYCLE 3

The process can be continued, that is, steps 4 to 7 can be
repeated again and again. In practice, cycle 3 is often the
last one.


THE FINAL STEP; THE ADJUSTED TABLE

The process will be stopped when another cycle would merely
result in a repetition of the same factors. When this stage is
reached, the factor $p_i + q_j$ is formed and multiplied by the
corresponding sample frequency $n_{ij}$, and the product is written
in cell ij beneath the corresponding sample frequency $n_{ij}$.
(For instance, for the cell i = 3, j = 2, the adjusted sample
frequency is 419(0.53385 + 0.50431) = 435, and this is written
beneath the sample frequency 419.)

In the illustration, there was no need of going beyond the
second cycle, since, as will be observed, the p and q factors
obtained in the third cycle are practically identical with those
obtained in the second. But of course, one could not perceive
this without going through the third cycle.

It was mentioned earlier that the process is self-correcting.
If a mistake is made somewhere in the computations, the
process will converge faster or slower, depending on the magnitude
and direction of the mistake. In consequence, fewer or
more cycles will be required before the factors repeat themselves.
The end result will nevertheless be the same as if no mistake
had been made. The computer may therefore assume that
when the factors repeat, his work is correct, and he is ready
for the final step.
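The directions above translate into a short routine. What follows is a sketch under stated assumptions rather than Stephan's own formulation: the function names are hypothetical, the factors are carried at full machine precision instead of five decimals (so they may differ from the printed ones in the last place), and the sample frequencies are those appearing in Table 8, with $n_{21} = 1570$ as used in step 4 and $n_{41} = 10538$ as implied by the row total 15859.

```python
N_SAMPLE = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
M_ROWS = [5252, 2395, 2432, 15766, 2330, 5662]   # m_i.
M_COLS = [22877, 5285, 3462, 2213]               # m_.j


def column_factors(n, p, m_cols):
    """Steps 2-3: q_j = (m_.j - sum_i n_ij * p_i) / n_.j"""
    return [(m - sum(row[j] * p_i for row, p_i in zip(n, p)))
            / sum(row[j] for row in n)
            for j, m in enumerate(m_cols)]


def row_factors(n, q, m_rows):
    """Steps 4-5: p_i = (m_i. - sum_j n_ij * q_j) / n_i."""
    return [(m - sum(x * q_j for x, q_j in zip(row, q))) / sum(row)
            for row, m in zip(n, m_rows)]


def stephan(n, m_rows, m_cols, tol=1e-10, max_cycles=100):
    """Iterate until the p factors repeat, then form n_ij * (p_i + q_j)."""
    p = [m / (2.0 * sum(row)) for row, m in zip(n, m_rows)]   # step 1
    history = []                     # (p, q) pairs, one per cycle
    for _ in range(max_cycles):
        q = column_factors(n, p, m_cols)
        history.append((p, q))
        new_p = row_factors(n, q, m_rows)
        repeated = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if repeated:
            break
    q = column_factors(n, p, m_cols)
    adjusted = [[x * (p_i + q_j) for x, q_j in zip(row, q)]
                for row, p_i in zip(n, p)]
    return adjusted, history


adjusted, history = stephan(N_SAMPLE, M_ROWS, M_COLS)
```

`history[0]` holds the factors of cycle 1 and `history[1]` those of cycle 2, for comparison with the worked figures in steps 1 to 7; `adjusted` holds the unrounded adjusted frequencies of the final step.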

Since the Stephan method gives the least squares solution, the
italicized figures in Table 8 (p. 124) are identical, except for rounding
errors, with the results in Table 1 (p. 107), which were obtained
by the use of normal equations. The least squares results in both
Tables 1 and 8 are in close agreement with those yielded by the
method of iterative proportions in Table 5 (p. 118), and with the
results to be obtained by the Bruyère method (next section) in
Table 10 (p. 126).

The choice between the different short-cuts (iterative proportions,
Stephan, and Bruyère) may reasonably lie in personal preference,
though the Stephan method has certain theoretical and practical
advantages, as mentioned above, and in some situations these
weigh heavily in its favor.

The Stephan method has been extended in the Census to three
dimensions; for general instructions, see the reference to Stephan
in footnote 2 on page 121.


TABLE 8

The Stephan adjustment applied to the example shown in Table 1

                 j = 1        2        3        4     n_i.     m_i.    p_i(1)   p_i(2)   p_i(3)
i = 1   n_ij      3623      781      557      313     5274            0.49791  0.49580  0.49580
        m_ij      3613      781      550      309              5252
i = 2   n_ij      1570      395      251      155     2371             .50506   .51011   .51011
        m_ij      1588      401      251      155              2395
i = 3   n_ij      1553      419      264      116     2352             .51701   .53384   .53385
        m_ij      1608      435      271      119              2432
i = 4   n_ij     10538     2455     1706     1160    15859             .49707   .49426   .49426
        m_ij     10492     2451     1681     1142             15766
i = 5   n_ij      1681      353      171      154     2359             .49385   .48740   .48739
        m_ij      1662      350      167      151              2330
i = 6   n_ij      3882      857      544      339     5622             .50356   .50699   .50699
        m_ij      3914      867      543      338              5662
n_.j             22847     5260     3493     2237    33837
m_.j             22877     5285     3462     2213             33837

q_j(1)         0.50134  0.50453  0.49099  0.48973
q_j(2)          .50137   .50431   .49084   .49019
q_j(3)          .50137   .50431   .49084   .49019

The adjusted frequencies (italicized in print; here the rows marked m_ij)
are rounded off, hence when summed may occasionally disagree a unit or so
with the expected marginal totals (also italicized in print; here m_i. and m_.j).
The latter arise by deflation from the universe rather than by direct addition
of the m_ij.

49. The Bruyère method. This method is closely related to the
other two short-cuts, and may be described as a precipitous
forcing at the end of the first half-cycle. It was shown to me by
Dr. Paul T. Bruyère, who had devised it some years earlier when
he encountered the problem of adjusting sample frequencies in
connexion with some surveys in medical research. It does not
give a least squares solution, but it is good enough, and has the
advantage of being the most rapid of all methods here explained.

1. Same as the first step in the method of iterative proportions
(Sec. 45): multiply the sample frequencies $n_{ij}$ in Row i
by the ratio $m_{i.}/n_{i.}$ as in Eq. 51. (This ratio will vary from
row to row.) Do this for every row.

For illustration, this proportionate adjustment will be carried
out on Table 1 (p. 107), the result being Table 3 (p. 117).

2. Form the column total $m'_{.j}$. Subtract it (usually mentally)
from the known total $m_{.j}$, and enter it as a "vertical
discrepancy" along the top of a new table (Table 9). Do this
for every column. In the same way, form also the resulting
horizontal discrepancies, $m_{i.} - m'_{i.}$. (These would be zero
except for errors in rounding off to integers.)

3. Make up a table of corrections (Table 9), based on the
vertical discrepancies found in step 2. Distribute any one of
these discrepancies amongst the cells in that column, in proportion
to the row totals, $m_{i.}$. To do this, first calculate the ratios
$m_{i.}/n$ where n is the total sample. Enter these ratios along the
left of Table 9: they constitute the multipliers for forming the
final corrections, which are entered in the body of the table.
The correction to be entered in row i and column j is the product
of $m_{i.}/n$ by the discrepancy in column j.

4. These corrections must now be forced, to equal the columnar
discrepancies, exactly. This forcing is to be carried out
so that (a) the sum of the corrections in any row equals the
corresponding "horizontal discrepancy," written in step 2
and entered in the right-hand column of Table 9, and so that
(b) the sum of the corrections in any column equals the corresponding
"vertical discrepancy," also written in step 2, and
entered near the top of Table 9. Parts (a) and (b) are entirely
independent. In Table 9 the forcing is indicated by putting
parentheses around a figure obtained in step 3, and writing a
new figure just to the left. Usually the forcing is small (a unit
or so in any cell), and needs to be done in only a few cells.
Large cells should be altered in the forcing, rather than small
ones. (See page 84.)



TABLE 9

Corrections for forcing the marginal totals
(Bruyère method)
(Steps 3 and 4)

                     Vertical discrepancies (written in step 2)                  Horizontal
                     j = 1         2          3          4                       discrepancies
         m_i./n        31         22        -33        -22        Row sums       (written in step 2)
i = 1    0.15521        5          3      -6 (-5)       -3        -1 (0)               -1
    2     .07078        2       1 (2)     -3 (-2)    -1 (-2)      -1 (0)               -1
    3     .07187        2          2         -2         -2           0                  0
    4     .46594    15 (14)       10        -15        -10        0 (-1)                0
    5     .06886        2          2         -2         -2           0                  0
    6     .16733        5          4      -5 (-6)       -4        0 (-1)                0
Column sums         31 (30)    22 (23)   -33 (-32)  -22 (-23)

TABLE 10

The final results obtained by the Bruyère method

            j = 1       2       3       4   m''_i.     m_i.
i = 1        3613     781     549     309     5252     5252
    2        1588     400     251     156     2395     2395
    3        1608     435     271     118     2432     2432
    4       10491    2451    1681    1143    15766    15766
    5        1662     351     167     150     2330     2330
    6        3915     867     543     337     5662     5662
m''_.j      22877    5285    3462    2213    33837
m_.j        22877    5285    3462    2213


5. Add each forced correction to the corresponding frequency
in Table 3. The result is Table 10, which is the end product.

Both row and column totals will agree with the controls, and
the work is finished, except for multiplying all the frequencies in
Table 10 by N/n, to make them correspond with the population
values (not shown here).
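Steps 1 to 3 of the Bruyère method are mechanical and can be sketched as below; the forcing of step 4 is a matter of judgment (which cells to alter) and is deliberately left out. The function name is hypothetical, the sample frequencies are those appearing in Table 8 (with $n_{21} = 1570$ and $n_{41} = 10538$ as fixed by the text of Sec. 48 and the row total 15859), and Python's built-in `round` stands in for rounding by hand.

```python
N_SAMPLE = [
    [3623, 781, 557, 313],
    [1570, 395, 251, 155],
    [1553, 419, 264, 116],
    [10538, 2455, 1706, 1160],
    [1681, 353, 171, 154],
    [3882, 857, 544, 339],
]
M_ROWS = [5252, 2395, 2432, 15766, 2330, 5662]   # m_i.
M_COLS = [22877, 5285, 3462, 2213]               # m_.j


def bruyere_steps_1_to_3(n, m_rows, m_cols):
    """Return the row-adjusted table, both sets of discrepancies, and the
    unforced corrections (the parenthesized figures of Table 9)."""
    # Step 1: proportionate adjustment by rows (Eq. 51), rounded to integers.
    stage1 = [[round(x * m / sum(row)) for x in row]
              for row, m in zip(n, m_rows)]
    # Step 2: vertical and horizontal discrepancies.
    vertical = [m - sum(col) for m, col in zip(m_cols, zip(*stage1))]
    horizontal = [m - sum(row) for m, row in zip(m_rows, stage1)]
    # Step 3: distribute each vertical discrepancy in proportion to m_i./n.
    total_sample = sum(sum(row) for row in n)
    corrections = [[round(m / total_sample * d) for d in vertical]
                   for m in m_rows]
    return stage1, vertical, horizontal, corrections


stage1, vertical, horizontal, corrections = bruyere_steps_1_to_3(
    N_SAMPLE, M_ROWS, M_COLS)
unforced_column_sums = [sum(col) for col in zip(*corrections)]
```

Comparing `unforced_column_sums` with `vertical`, and the row sums of `corrections` with `horizontal`, shows exactly how much forcing step 4 has to supply.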

50. Some remarks on the accuracy of an adjustment. A least
squares adjustment of sampling results must be regarded as a
systematic procedure for obtaining satisfaction of the conditions
imposed, and at the same time effecting an improvement of the
data in the sense of obtaining results of smaller variance than the
sample itself, under ideal conditions of sampling from a stable
universe. As a matter of fact, the variance of the residuals arising
in the adjusted cells will decrease with the difference between the
total number of cells and the number of control totals, according
to the results of Exercise 3 at the end of Chapter V (p. 68). It
must not be supposed that any particular adjusted cell frequency
is necessarily better than the original sample frequency in the
sense of being closer to the complete count. It may be, but also
it may not be, and there is no statistical way of discovering which.
All we know is that on the average the adjusted cell frequencies
will be better.

But the decrease in variance is not all; adjustment to known
control totals has at the same time the effect of eliminating biases
in the nature of inherent differences between the sample and complete
count. This effect is often more important than the decrease
in the variance of the sample frequencies.

It is desirable to get some idea of the errors of sampling by actual
trial, such as by a comparison of certain sampling marginal totals
with the corresponding universe totals, as can often be arranged by
means of controls. Also, the sample can be tested for regularity
of patterns. There is another aspect to the problem of error:
even a 100 percent count is not by itself useful for formulating
social and economic plans, except so far as we can assert on other
grounds what secular changes are taking place.



Part D 

CONDITIONS CONTAINING PARAMETERS 


CHAPTER VIII 

CURVE FITTING IN MORE COMPLICATED CIRCUMSTANCES

51. Some general remarks on the purpose of curve fitting. To
extend the theme of Chapter I, we may say that the reason for
fitting a curve to a set of data is to summarize the evidence provided
by that experiment for making predictions with regard to
future data. It is not the data fitted that are of primary interest:
it is the data of the next experiment that one holds in awe. Will the
curve fit, or will it not? And when we decide whether the curve
fits, we do so on the basis of whether it fits well enough to give
useful results. Are deductions (predictions) made from this curve
borne out in practice closely enough so that it can be used as a
basis for action? The method that gives the best predictions is
the best method.

There have been many instances when deductions made from a 
fitted curve, or from a series of curves, have made it unnecessary 
to perform certain other experiments. As an instance, we may 
turn to Example 1 of Chapter XI, where a quartic is fitted to some 
compressibility data published by the Michels in Amsterdam. 
This quartic, when fitted to their compressibility data on carbon 
dioxide, and differentiated, integrated, and otherwise evaluated, 
gives data on the index of refraction, the Joule-Thomson coefficient, 
entropy, and other physical properties, that would be difficult and 
time consuming, if direct observation were required. When we 
say that the quartic, fitted to the compressibility data, gives values 
of the Joule-Thomson coefficient, we mean that for certain purposes,
the prediction is satisfactory in place of the Joule-Thomson
coefficient that would be observed directly. Certain checks, in 
terms of other experience and other deductions, which may be 
available in isolated portions of the ranges of pressure, volume, and 
temperature that are covered by the compressibility data, lend 
confidence to the results — confidence of the kind that can be 
translated into action, such as the design of compressors for
refrigerators and other machinery.

In order to extend the region of prediction into other areas and 
other ranges (e.g., other cities, other economic levels, higher
pressures, higher temperatures), not yet included in the experiments,
it is necessary to have related experience, or meanwhile to regard 
extrapolations as pure conjectures. Such conjectures may be 
regarded as predictions, but without a high degree of belief, and 
not as a scientific basis for action. 

It is important to keep in mind the ultimate purpose of curve 
fitting, particularly when one is actually fitting curves for purposes 
of action. Meanwhile, it is necessary that one learn to perform or 
understand some of the procedures by which curves can be fitted. 
To this end, we resume our study of the adjustment of observations, 
returning to the general solution worked out in Chapter IV. 

It will be recalled that earlier in the book some simple problems 
in curve fitting were treated (Secs. 9 and 10, the single sample; 
Sec. 12, several samples; Sec. 15, a line through the origin). These 
problems were simple, not just because the functions were simple 
ones, but also because the errors in the variables entered in such 
manner that the parameters (adjustable constants) could be found 
directly by differentiating S. The solutions obtained for those 
circumstances were and still are satisfactory, but the research 
worker must be prepared for more complicated situations, such as 
both variables subject to error, or functions in which the parameters
and the errors do not enter in so simple a manner.

A framework will be developed in this chapter for more complicated
circumstances. The simple problems just mentioned will
of course fit into this framework, as will occasionally be pointed out.
Fortunately, the solution is already worked out in general terms;
it is contained in the general normal equations on page 55. All
we need to do now is to apply this solution to curve fitting, and see
how it can be adapted to routine computational procedure. 

There will be a function to be fitted. We might write it 

F(x, y; a, b, c) = 0 (1) 

to indicate that there is an equation involving x and y, and the
(adjustable) parameters a, b, and c. In the examples already
seen in Sections 9, 12, and 15, the functions were simple, namely,
x = a, and y = bx.

The problems considered in the last three chapters (constituting 
Part C) did not contain parameters; the conditions were rigorous, 
being geometric, or forced by complete counts. No question arose 
concerning the adjustment of parameters, because there was none. 
We threw away with abandon certain rows and columns of the 
general solution that owed their origin to the parameters (p. 59). 
Now, however, we can not do this; the condition equations contain 
parameters, and we must deal with them. We shall see, though, 
that there will be simplifications of other kinds, and we shall 
develop a procedure not unlike that contained in Chapters V
and VI.

52. Graphical considerations. It is desirable to have in mind
the picture of curve fitting shown in Figs. 16 and 17, pages 132
and 133. There are observed points, calculated points, and true
points. The calculated points, by definition, lie on the calculated
curve, and the true points lie on the true curve. The equation of
the calculated curve may be written in the form of Eq. 1, wherein
x and y are the coordinates of any point on the curve, and a, b, c
are the calculated (or adjusted) values of the parameters, which
are to be found in the solution (see Eqs. 6, p. 52). The equation
of the true curve is the same, except that it is drawn with the true
parameters $\alpha$, $\beta$, $\gamma$. It is assumed that if the errors of observation
were negligible, the observed coordinates X and Y would satisfy 
the true curve. Actually, however, the observed points do not lie 
on either the calculated curve or the true curve; in fact, owing to 
errors of observation, the observed points usually do not lie on 
any curve at all of the form of Eq. 1, though they may approximate 
one closely. 





As in Chapter IV, we shall need some approximate parameters
$a_0$, $b_0$, $c_0$ to start off with. If they were used for drawing the curve,
in place of a, b, c, there would be still another curve (not shown)
in Figs. 16 and 17, which might be called the approximate curve
if it needed a name.

The calculated coordinates, and the calculated parameters,
satisfy Eq. 1 exactly. The observed coordinates, however, and
the approximate parameters $a_0$, $b_0$, $c_0$, do not. When, for any
point, the observed coordinates X and Y, and the approximate
parameters, are substituted into the left-hand side of Eq. 1, the
equation is usually not satisfied, which is to say that the left-hand
side is usually not zero, but is instead some small quantity $F_0$,
defined as

$$F_0 = F(X, Y; a_0, b_0, c_0) \qquad \text{(Cf. Eq. 5, p. 52.)} \tag{2}$$

Of course, $F_0$ may by accident be zero at some point, and will be
zero by design at any point through which the approximate curve
is forced to pass, as when the method of selected points is used to
determine satisfactory approximate values $a_0$, $b_0$, $c_0$ (cf. the reduced
type at the end of Sec. 55). The quantities $F_0$ at point No. 1,
No. 2, etc., will appear in the normal equations of Section 55.

For simplicity and definiteness, the development will be written
out for only two coordinates, x and y, at each point. The extension
to three coordinates is obvious, in which event Eq. 1, instead of
being the equation of a curve in the x, y plane, is written as the
surface F(x, y, z; a, b, c) = 0 in the x, y, z space. See Example 3 on
pp. 231 ff. for an illustration in three dimensions, and Exercise 26 of
Section 71 for one in four dimensions. An increase in the number
of dimensions does not necessarily increase the complexity of a
problem.

A point is observed to be X, Y; that is, the x coordinate of some
true point $\xi$, $\eta$ is measured, perhaps several times, and the mean of
these measurements is X with weight $w_x$. Likewise, the y coordinate
of the same (true) point is measured, perhaps several times,
and the mean of them is Y with weight $w_y$. By Eq. 16 on page 22,
the weights of the observed coordinates X and Y at this point will
be in the inverse ratio of the variances of protracted random series
of measurements on the true coordinates $\xi$, $\eta$ (Ch. I), and directly
in the ratio of the numbers of observations taken. Otherwise
expressed, if these variances are denoted by $\operatorname{Var}_x$ and $\operatorname{Var}_y$, and
the numbers of observations by $N_x$ and $N_y$, then the ratio of the
weights will be

$$w_x : w_y = \frac{N_x \operatorname{Var}_y}{N_y \operatorname{Var}_x} \tag{3}$$




[Figure 16: observed, calculated, and true points, drawn together with
the calculated curve F(x, y; a, b, c) = 0 and the true curve
$F(x, y; \alpha, \beta, \gamma) = 0$.]

Fig. 16. A typical situation in curve fitting. It is assumed that the “true
points,” wherever they are, lie on the “true curve” $F(x, y; \alpha, \beta, \gamma) = 0$,
$\alpha$, $\beta$, $\gamma$ being the true and unknown values of the parameters. The
“calculated points” all lie on the “calculated curve” F(x, y; a, b, c) = 0, a, b, c
being the calculated values of the parameters. This figure and the next one
first appeared in an article entitled “On the chi-test and curve fitting,”
J. Amer. Stat. Assoc., vol. 29, 1934: pp. 372-382.


Of course, this ratio may vary from point to point, depending on
the variation of the factors on the right (cf. Example 2,
pp. 218-230).
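The ratio in Eq. 3 is simple enough to check by hand, but a small sketch may help fix the direction of each factor. The function name and the numbers below are illustrative assumptions, not taken from the text.

```python
def weight_ratio(n_x, var_x, n_y, var_y):
    """Return w_x / w_y as in Eq. 3: weights vary directly with the
    number of observations and inversely with the variance."""
    return (n_x * var_y) / (n_y * var_x)


# Hypothetical point: X is the mean of 4 readings with variance 0.25,
# Y is a single reading with variance 1.0.
ratio = weight_ratio(n_x=4, var_x=0.25, n_y=1, var_y=1.0)
print(ratio)  # 16.0
```

Here X deserves sixteen times the weight of Y: a factor of four from the repeated readings and another factor of four from the smaller variance.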

53. The conditions. For each point observed, there is a calculated
point, and this calculated point is forced to lie on the
calculated curve (preceding section). The residuals must be just
the distances required to put the calculated point on the calculated
curve; see Figs. 16 and 17. Now since the calculated point must
lie on the calculated curve, its coordinates must satisfy Eq. 1
(p. 130), which is to say that

F(x, y; a, b, c)

must vanish at every one of the calculated points, as indeed it must
at all points along the calculated curve. Thus, for every point
there is one condition imposed on the adjusted coordinates.
Altogether, there are as many conditions as there are points. For
n points there are n conditions.

Fig. 17. Relations between the “true,” “observed,” and “calculated”
points. The x and y coordinates of a point are observed; these observations
when plotted give the “observed points.” The point that was measured
is the “true point,” which is unknown and lies on the unknown “true curve”
$F(x, y; \alpha, \beta, \gamma) = 0$, $\alpha$, $\beta$, $\gamma$ being the true but unknown values of the
parameters. The “calculated curve” is found by adjusting a series of observed
points; its equation will have the same form as the true curve, but the
parameters therein will be the “calculated parameters” a, b, c. Corresponding to
each observed point there will be a “calculated point,” whose coordinates
are found by subtracting the “residuals” $V_x$ and $V_y$ from the observed
coordinates X and Y. $E_x$ and $E_y$ denote the “errors in the observed points”;
$E_x$, $E_y$, and $U_x$ and $U_y$ are unknown, but $V_x$ and $V_y$ are calculated along with
the parameters a, b, c by the method of least squares. As the figure happens to
be drawn, each of the six quantities $E_x$, $E_y$, $U_x$, $U_y$, $V_x$, $V_y$ is positive. Their
signs are indicated by the directions of the arrows.

Next, we look at the general normal equations (p. 55), also at
the conditions that were imposed (Eqs. 3, p. 50), and we seek a
way of writing these conditions so as to force the calculated points
to lie on the calculated curve. This can be done by writing the
condition functions in the form

$$F^h = F(x_h, y_h; a, b, c), \qquad h = 1, 2, \ldots, n \tag{4}$$

wherein the function F on the right is the function found in Eq. 1,
which is to be fitted, and $x_h$, $y_h$ are the final calculated or adjusted
coordinates at point h. These coordinates, along with the calculated
parameters a, b, and c, are to have such values that the
function F on the right vanishes at every calculated point.

54. The L coefficients. Immediately upon writing the condition
functions in this way, we perceive that the coordinates of
point h will enter the condition function $F(x_h, y_h; a, b, c)$ for that
point, but not the condition function $F(x_g, y_g; a, b, c)$ for some
other point g. As a consequence,¹

$$\frac{\partial}{\partial x_g} F(x_h, y_h; a, b, c) = 0, \qquad
\frac{\partial}{\partial y_g} F(x_h, y_h; a, b, c) = 0
\qquad \text{if } g \neq h \tag{5}$$

It then follows from Eq. 14, page 55 (wherein L was defined for
the general solution), that

$$L_{gh} = 0 \quad \text{when } g \neq h \tag{6}$$


This means that the L coefficients in the general normal equations
(p. 55) standing off the diagonal are zero.¹ Moreover, the L
coefficients standing on the diagonal contain only two terms each,
because all the other derivatives are zero. For these two, we may
write

$$F_x = \frac{\partial}{\partial x_h} F(x_h, y_h; a, b, c)
\quad \text{and} \quad
F_y = \frac{\partial}{\partial y_h} F(x_h, y_h; a, b, c) \tag{7}$$

whereupon

$$L = \frac{F_x F_x}{w_x} + \frac{F_y F_y}{w_y} \tag{8}$$


The suffix h has been omitted, it being simpler to leave it understood
that the derivatives and the weights, and hence the L coefficients,
may vary from point to point.
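As a small numerical sketch of Eq. 8, assume the straight line F = y - a - bx, for which $F_x = -b$ and $F_y = 1$. The function name and the values are illustrative assumptions. An infinite weight, meaning a coordinate taken as free of error, makes its term vanish.

```python
import math


def l_coefficient(f_x, f_y, w_x, w_y):
    """L = F_x*F_x/w_x + F_y*F_y/w_y (Eq. 8) at one point."""
    return f_x * f_x / w_x + f_y * f_y / w_y


# Assumed example: straight line F = y - a - b*x, so F_x = -b, F_y = 1.
b0 = 2.0
both_in_error = l_coefficient(-b0, 1.0, w_x=4.0, w_y=1.0)
x_exact = l_coefficient(-b0, 1.0, w_x=math.inf, w_y=1.0)
print(both_in_error, x_exact)  # 2.0 1.0
```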

1 A problem in which the coordinates of one point do enter the condition
function for an adjacent point was published by the author in the Phil. Mag.,
vol. 17, 1934: pp. 804-829. The problem dealt with the oscillations of the 
pointer on a chemical balance. 





Remark 1. The first term on the right of Eq. 8 drops out at
any point where X is free of error, for then the denominator
$w_x$ is $\infty$; and similarly the second term drops out if Y is free of
error. If both X and Y are subject to error, the two terms may
be of comparable magnitude, and both accordingly retained.
(Cf. Remark 3 in Exercise 4 of Sec. 65, p. 181.)

Remark 2. From the way in which the L coefficients enter 
the normal equations, and affect the standard errors of the 
parameters (Sec. 62), it will be seen that one object in the design 
of an experiment should be to produce small values of L. The 
two terms on the right of Eq. 8 give an indication of where time 
and funds may wisely be apportioned. If one term is already 
small (possibly a quarter) compared with the other, then it 
might not be worth the necessary expenditure to reduce that 
term, already small, to half its value. Better it would be to 
spend even more time and funds to halve the larger term. Similarly,
if the L coefficient at one point is already small compared
with the L coefficients at some of the other points, then instead 
of using time and funds to reduce the small one further, it 
might be wiser first to reduce the L coefficients at those points 
where they are largest. 
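The arithmetic behind Remark 2 can be sketched as follows, assuming for illustration that the two terms of Eq. 8 stand in the ratio 1 to 4. The function name and the magnitudes are hypothetical.

```python
def reduction(small, large, halve):
    """Fractional reduction in L = small + large when one term is halved."""
    before = small + large
    after = (small / 2 + large) if halve == "small" else (small + large / 2)
    return (before - after) / before


# Assumed magnitudes: one term a quarter of the other.
print(reduction(1.0, 4.0, "small"))  # 0.1
print(reduction(1.0, 4.0, "large"))  # 0.4
```

Halving the smaller term lowers L by a tenth, whereas halving the larger term lowers it by four tenths.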

Remark 3. By Eq. 9, page 40, we see that the L coefficient
at a particular point is none other than the reciprocal of the
weight of F evaluated with the corresponding observed coordinates
X, Y. This value of F would be written F(X, Y; a, b, c).
It is the quantity designated as $F_0'$ in Exercises 3, 4, and 5 of
Section 58, pages 145-146. Since L is the reciprocal of a weight,
1/W will frequently be written in place of L as we go along.
(Cf. Remark 3, p. 181.)


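55. The normal equations for curve fitting. The general normal
equations (p. 55) now take the form shown below.

$$
\begin{array}{cccccccc|c}
\lambda_1 & \lambda_2 & \lambda_3 & \cdots & \lambda_n & A & B & C & = 1 \\
\hline
L_1 & 0 & 0 & \cdots & 0 & F_a^1 & F_b^1 & F_c^1 & F_0^1 \\
0 & L_2 & 0 & \cdots & 0 & F_a^2 & F_b^2 & F_c^2 & F_0^2 \\
0 & 0 & L_3 & \cdots & 0 & F_a^3 & F_b^3 & F_c^3 & F_0^3 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & L_n & F_a^n & F_b^n & F_c^n & F_0^n \\
F_a^1 & F_a^2 & F_a^3 & \cdots & F_a^n & 0 & 0 & 0 & 0 \\
F_b^1 & F_b^2 & F_b^3 & \cdots & F_b^n & 0 & 0 & 0 & 0 \\
F_c^1 & F_c^2 & F_c^3 & \cdots & F_c^n & 0 & 0 & 0 & 0
\end{array}
\tag{9}
$$

These can quickly be reduced to a smaller set, in number equal to the number
of adjustable parameters. First, eliminate $\lambda_1, \lambda_2, \ldots, \lambda_n$ by solving
for them in the upper n equations, getting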


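$$
\begin{aligned}
\lambda_1 &= \frac{1}{L_1}\,(F_0^1 - F_a^1 A - F_b^1 B - F_c^1 C) \\
\lambda_2 &= \frac{1}{L_2}\,(F_0^2 - F_a^2 A - F_b^2 B - F_c^2 C) \\
&\;\;\vdots \\
\lambda_n &= \frac{1}{L_n}\,(F_0^n - F_a^n A - F_b^n B - F_c^n C)
\end{aligned}
\qquad (\text{a } \lambda \text{ for each point}) \tag{10}
$$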

Then substitute these values of $\lambda_1, \lambda_2, \ldots, \lambda_n$ into the lower three
rows of Eqs. 9. The result is Eqs. 11. As in Chapter V (see
p. 59), the coefficients below the diagonal have been omitted.


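$$
\begin{array}{ccc|c}
A & B & C & = 1 \\
\hline
\left[\dfrac{F_a F_a}{L}\right] & \left[\dfrac{F_a F_b}{L}\right] & \left[\dfrac{F_a F_c}{L}\right] & \left[\dfrac{F_a F_0}{L}\right] \\
 & \left[\dfrac{F_b F_b}{L}\right] & \left[\dfrac{F_b F_c}{L}\right] & \left[\dfrac{F_b F_0}{L}\right] \\
 & & \left[\dfrac{F_c F_c}{L}\right] & \left[\dfrac{F_c F_0}{L}\right]
\end{array}
\qquad \text{(The normal equations for curve fitting)} \tag{11}
$$

Here the square brackets denote summation over the n points.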


These are the normal equations for curve fitting. They contain
only the parameter-residuals A, B, C as unknowns. The arrangement
of the coefficients is symmetrical, and their quadratic form
positive definite, like the general normal equations whence they
came (Sec. 28). Once the parameter-residuals A, B, and C are
obtained from the solution of the normal equations, the adjusted
values a, b, c of the parameters are found immediately by subtraction.
More explicitly,

$$a = a_0 - A, \qquad b = b_0 - B, \qquad c = c_0 - C \qquad \text{(Eqs. 6, p. 52)}$$

The calculated curve is Eq. 1 into which the adjusted values of
a, b, and c have been inserted.








Several details remain: to adjust the observations (Secs. 56 
and 58); to work out a systematic procedure for forming the 
normal equations and solving them (Secs. 60 and 61); to discover 
in this systematic procedure a quick way of calculating the minimized
sum of squares, S (p. 57 and the exercise following); to
calculate the variance and product variance coefficients of a, b, c 
(found in the reciprocal matrix, Secs. 61 and 62). 

It is important to observe that the final values of a, b, and
c will be independent of the approximations $a_0$, $b_0$, $c_0$. That is
to say, two computers, starting off with slightly different approximations
(but with the same observations), will find their
parameter-residuals A, B, and C to be just enough different so
that their final calculated values of a, b, and c, and hence their
final calculated curves, are practically identical.
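The reduction to Eqs. 11, and the remark that the starting approximations do not matter, can be tried on the simplest case. The following is a minimal sketch, assuming the straight line F = y - a - bx with both coordinates subject to error, so that $F_a = -1$, $F_b = -x$, $F_x = -b$, $F_y = 1$. The derivatives are evaluated at the observed points with the current approximations. Repeating the pass with each result fed back in as the new approximation is an assumption of this sketch; the function names and the data are made up for illustration.

```python
def fit_line_pass(points, a0, b0):
    """One pass of Eqs. 8-11 for the line F = y - a - b*x = 0.

    points: list of (X, Y, w_x, w_y); a0, b0: approximate parameters.
    Returns the adjusted parameters a = a0 - A, b = b0 - B.
    """
    saa = sab = sbb = sa0 = sb0 = 0.0
    for X, Y, w_x, w_y in points:
        f_x, f_y = -b0, 1.0                      # derivatives w.r.t. coordinates
        f_a, f_b = -1.0, -X                      # derivatives w.r.t. parameters
        f_0 = Y - a0 - b0 * X                    # Eq. 2
        L = f_x * f_x / w_x + f_y * f_y / w_y    # Eq. 8
        saa += f_a * f_a / L
        sab += f_a * f_b / L
        sbb += f_b * f_b / L
        sa0 += f_a * f_0 / L
        sb0 += f_b * f_0 / L
    det = saa * sbb - sab * sab                  # solve the 2x2 normal equations
    A = (sa0 * sbb - sab * sb0) / det
    B = (saa * sb0 - sab * sa0) / det
    return a0 - A, b0 - B


def fit_line(points, a0, b0, passes=20):
    """Repeat the pass, feeding each result back in as the approximation."""
    a, b = a0, b0
    for _ in range(passes):
        a, b = fit_line_pass(points, a, b)
    return a, b


# Made-up observations (X, Y, w_x, w_y), with weights varying by point.
data = [(0.0, 1.1, 1.0, 4.0), (1.0, 2.9, 4.0, 1.0), (2.0, 5.2, 1.0, 4.0),
        (3.0, 6.8, 4.0, 1.0), (4.0, 9.1, 1.0, 4.0)]
print(fit_line(data, 0.5, 1.5))
print(fit_line(data, 1.5, 2.5))   # a different starting approximation
```

The two printed pairs should agree to many decimal places, the difference in the starting values having been absorbed by the parameter-residuals.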

Under some circumstances, and for some purposes, however,
the approximations $a_0$, $b_0$, $c_0$ must not be too rough. In other
problems it makes no difference what these approximations
are, except that always the rougher they are, the more figures
are required in the normal equations, hence the greater the
computational effort required. These remarks are repeated
more specifically in Exercises 4, 5, and 10 in Chapter X (pp. 179,
183, and 187).

In regard to the matter of arriving at the approximations
$a_0$, $b_0$, $c_0$, it should be made clear at the outset that in practice
this is usually not difficult. Often one will have good enough
approximations simply from previous experience. There are
graphical methods, by which one draws in a curve free-hand,
after making a judicious choice of scales, such as changing
$y = ae^{bx}$ into the logarithmic form $\ln y = \ln a + bx$, to make
it straight. There is the “method of averages,” called by
Norman Campbell² the “method of zero sum,” by which one
finds what values of a, b, and c will force the calculated curve to
average out correctly over groups of points (three groups if
there are three parameters). Then there is Cauchy’s method,³
which has much to recommend it.

2 Norman Campbell, Phil. Mag., vol. 39, 1920: pp. 177-194. See also
Whittaker and Robinson’s Calculus of Observations, Art. 131, p. 258. According
to them, the method of averages (i) was much used in the latter
half of the 18th century; (ii) was first published by Tobias Mayer in 1748
and 1760. A recent paper by Wald contains some interesting and valuable
theoretical work on the method of averages. It turns out that when the x
and y observations have weights in constant ratio, the method of averages is
unbiased, and in statistical efficiency compares well with the method of least
squares, at a considerable saving in labor (in agreement with Campbell).
The reference to Wald’s work is the Annals of Math. Statistics, vol. 11, 1940:
pp. 284-300.

Lastly, there is the so-called method of selected points, concerning
which brief mention was made at the end of Section 25,
page 52. By this method one simply selects three points
(usually two end points and a middle point, if there are three
parameters) and solves three simultaneous equations to find
what values of a, b, and c force the calculated curve to pass
through these points. The values so found serve as $a_0$, $b_0$, $c_0$.
If there are two parameters, two points are selected, and so on.
This calculation is often fairly simple to carry out, and it possesses
the advantage of giving the computer zero values for
three of his $F_0$ functions, thus slightly cutting down his computational
effort.
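A minimal sketch of the method of selected points, assuming the two-parameter exponential $y = ae^{bx}$ (so that two points are selected rather than three) and made-up observations. The function names are hypothetical. The values of $F_0$ come out zero, up to rounding, at the selected points and not elsewhere.

```python
import math


def selected_points_exponential(p1, p2):
    """Force y = a*exp(b*x) through two chosen points; returns (a0, b0)."""
    (x1, y1), (x2, y2) = p1, p2
    b0 = math.log(y2 / y1) / (x2 - x1)   # slope of ln y against x
    a0 = y1 * math.exp(-b0 * x1)
    return a0, b0


def f0(X, Y, a0, b0):
    """F_0 = F(X, Y; a0, b0) for F = y - a*exp(b*x) (Eq. 2)."""
    return Y - a0 * math.exp(b0 * X)


# Made-up observations; the two end points are the selected ones.
obs = [(0.0, 2.0), (1.0, 3.4), (2.0, 5.4)]
a0, b0 = selected_points_exponential(obs[0], obs[-1])
print(a0, b0)
print([f0(X, Y, a0, b0) for X, Y in obs])
```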

The method of selected points is much used and easily justified
on grounds of simplicity, yet it is about the worst conceivable
method of curve fitting. If the computer is not careful to
select representative points, he throws away practically all the
information contained in the rest of the points. Yet this much
can be said, if there were no errors in any of the points, it would
yield the correct results for a, b, and c.

For free-hand methods of curve fitting, and for general advice
in the interpretation of statistical calculations, Ezekiel’s
Methods of Correlation Analysis (John Wiley, 2d ed., 1941) is
heartily recommended.

56. Adjusting the observations, or finding the calculated points.
Going back to Eqs. 12 on page 54, we see that the x and y residuals
will depend on the Lagrange multipliers in the following manner:

$$V_x = \frac{1}{w_x}\,\lambda_h F_x \qquad (x \text{ residual at point } h) \tag{12x}$$

$$V_y = \frac{1}{w_y}\,\lambda_h F_y \qquad (y \text{ residual at point } h) \tag{12y}$$

Once the residuals $V_x$ and $V_y$ have been computed, the adjusted
(calculated) coordinates at point h can be found from the equations

$$x_h = X_h - V_x \tag{13x}$$

$$y_h = Y_h - V_y \tag{13y}$$

(Cf. Eqs. 6, p. 52.)

3 Cauchy, Comptes rendus, vol. 25, 1847: p. 650.





The numerical value of the Lagrange multiplier $\lambda_h$ can be
found from Eq. 10 (p. 136). $F_x$ and $F_y$ are numerical values of
the derivatives of F at point h. After differentiation, $F_x$ and $F_y$
are functions of x, y, and a, b, c. We can not evaluate these
derivatives numerically at the calculated point h until we find
the residuals and the calculated coordinates, but that is just
what we are trying to do now. In practice, fortunately, for use
in Eqs. 10 and 12 it is sufficient to evaluate the derivatives at
the observed point, using the approximate parameters, though
the final calculated parameters can be used if desired.

Eqs. 13 give the coordinates of the calculated point corresponding
to the observed point $X_h$, $Y_h$. Finding the calculated points
is the process of adjusting the observations; when $x_h$ and $y_h$ have
been calculated, the observations $X_h$, $Y_h$ are said to be adjusted.
The calculated point $x_h$, $y_h$ is the least squares estimate of the position
of the unknown true point $\xi_h$, $\eta_h$. Obviously, it will depend
not only upon $X_h$, $Y_h$, and their weights, but also more or less upon
all the other points and their weights. In randomness, the
variances of the calculated coordinates are less than the variances
of the observed coordinates.

Just how is this dependence tied up with the other points?
Through the normal equations (Eqs. 9, p. 135), or their equivalent,
Eqs. 10 and 11. Eqs. 11 supply the parameter-residuals A, B,
and C, which are to be used in Eqs. 10 to find $\lambda_1, \lambda_2, \ldots, \lambda_n$.
These in turn are used in Eqs. 12 to compute the x and y residuals
at each point, by which the observations are adjusted, as indicated
in Eqs. 13.
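The chain from Eqs. 10 through 12 to 13 can be sketched for a single point. The example assumes the straight line F = y - a - bx and takes the final calculated parameters as the approximations, so that the parameter-residuals A and B in Eq. 10 are zero and $\lambda_h = F_0 / L$. The function name and the numbers are illustrative assumptions.

```python
def adjust_point(X, Y, w_x, w_y, a, b):
    """Adjust one observed point onto the line y = a + b*x.

    Uses Eqs. 10, 12, 13 with the final parameters taken as the
    approximations, so the parameter-residuals A and B are zero.
    """
    f_x, f_y = -b, 1.0                       # derivatives of F = y - a - b*x
    f_0 = Y - a - b * X                      # F at the observed point
    L = f_x * f_x / w_x + f_y * f_y / w_y    # Eq. 8
    lam = f_0 / L                            # Eq. 10 with A = B = 0
    v_x = lam * f_x / w_x                    # Eq. 12x
    v_y = lam * f_y / w_y                    # Eq. 12y
    return X - v_x, Y - v_y                  # Eqs. 13


x, y = adjust_point(X=1.0, Y=3.5, w_x=1.0, w_y=1.0, a=1.0, b=2.0)
print(round(x, 6), round(y, 6))  # 1.2 3.4
```

The calculated point (1.2, 3.4) lies on the line y = 1 + 2x, and with equal weights the displacement from the observed point (1.0, 3.5) has slope -1/2, the negative reciprocal of the slope of the line.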

We now have a method of adjusting the observations and of
estimating the parameters a, b, c, when both the x and y coordinates
are subject to error, but it must be remembered that the solution
depends on certain simplifying assumptions; namely, that the
squares and higher powers of the residuals can be neglected in the
Taylor series of Chapter IV.

Remark 1. A familiar method of adjusting the observations,
valid when all the x coordinates are free of error, is to substitute
the coordinate free of error into the formula F(x, y; a, b, c) = 0
(Eq. 1), and solve for the other coordinate. Thus in the parabola,

$$y = a + bx + cx^2 \tag{14}$$

if x is free of error, it is easy to calculate y for a given x, once
a, b, and c are determined. But if x is subject to error, one
must either solve a quadratic for x in terms of a known y, or
(what is usually easier) adjust the x coordinate by using
Eqs. 12 and 13, after evaluating the Lagrange multipliers in
Eqs. 10.
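The two routes mentioned for the parabola can be compared numerically. The sketch below assumes the curve y = x^2 (a = b = 0, c = 1), an observed X = 2.1, and a Y = 4 taken as free of error, so that the y term of Eq. 8 vanishes. The function name and the numbers are illustrative assumptions.

```python
import math


def adjust_x_on_parabola(X, Y, a, b, c, w_x=1.0):
    """Adjust X when Y is free of error, for F = y - a - b*x - c*x**2."""
    f_x = -(b + 2.0 * c * X)           # dF/dx at the observed point
    f_0 = Y - a - b * X - c * X * X    # F at the observed point
    L = f_x * f_x / w_x                # Eq. 8; the y term vanishes (w_y infinite)
    lam = f_0 / L                      # Eq. 10 with A = B = C = 0
    v_x = lam * f_x / w_x              # Eq. 12x
    return X - v_x                     # Eq. 13x


# Assumed curve y = x**2; observed X = 2.1 with an error-free Y = 4.
x_adj = adjust_x_on_parabola(X=2.1, Y=4.0, a=0.0, b=0.0, c=1.0)
x_root = math.sqrt(4.0)                # exact root of the quadratic near X
print(round(x_adj, 4), x_root)  # 2.0024 2.0
```

The adjusted coordinate differs from the exact root of the quadratic only by a term of the second order in the residual, which is the order of quantity neglected throughout. The sketch fails where the curve is horizontal at the observed point, since $F_x$ is then zero.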

When the function F of Eq. 1 is not solved for y explicitly,
it may be easier to adjust the y coordinates by means of Eqs. 10,
12, and 13, rather than to substitute directly into Eq. 1 and solve
for y in terms of x. Similar remarks apply for adjusting the
coordinates when x alone is subject to error.

When both coordinates are subject to error, one must apply 
Eqs. 10, 12, and 13, if he would adjust the observations. (For 
a numerical illustration, see the example treated in Section 78, 
pages 227 ff. As an exercise, the student could at this time 
work out the numerical values of the remaining nine calculated 
points in that example.) 

Remark 2. Gauss and others gave methods for adjusting
the observations in problems of geodesy and astronomy 
(Chs. V, VI, and VII, constituting Part C). Unfortunately, 
they did not give much attention to problems in which the 
conditions contain parameters (curve fitting), especially when 
more than one coordinate is subject to error. 

It has sometimes been said that least squares is reasonable
enough in surveying and astronomy, but that it is illogical and
equivocal in curve fitting. Actually, the principle of least
squares is always the same (p. 14). The distinction between the
problems lies in the conditions that the adjusted quantities are
subjected to. A neglected but worthy paper by Kummell⁴ in
1879, had it not been overlooked, could have set matters
straight. Later papers by Stewart⁵ (1920) and Uhler⁶ (1923)
also emphasized the unity of the different kinds of problems.

4 Charles H. Kummell, The Analyst (Des Moines), vol. 6, 1879: pp. 97-105.

5 R. Meldrum Stewart, Phil. Mag., vol. 40, 1920: pp. 217-227.

6 Horace S. Uhler, J. Optical Soc., vol. 7, 1923: pp. 1043-66.

Remark 3. The term “adjustment of observations,” as
it has often been applied heretofore to curve fitting, has meant
a calculation of the parameters a, b, c from a set of data. Now
we see that the parameters enter only as unknowns in the
conditions that are forced upon the adjusted quantities. Least
squares is primarily a method of adjusting the observations,
and the parameters enter only incidentally. As a matter of
fact, least squares is the only method of curve fitting by which
one can profess to adjust his observations. But, of course, the
determination of useful values of the parameters and their
standard errors is often the prime purpose of an investigation.

Remark 4. The method of least squares is the only analytic
device for curve fitting that takes account of the weights of
the observations. If both x and y are subject to error, it is
necessary that both coordinates be given their proper weighting
at every point. In graphical methods of curve fitting, the
eye can be trained, to some extent, to take account of weights.

Remark 5. In the next chapter we shall see that it is possible
to compute

$$S = \sum (w_x V_x^2 + w_y V_y^2) \tag{15}$$

without calculating the individual x and y residuals, or squaring,
weighting, and adding them. However, it is often found
worth while to draw the calculated curve, and to lay off the x
and y residuals, to be able to note whether any of them is especially
large. In this manner, it is sometimes possible to discover
sources of spurious observations that would be hidden in a comprehensive
test like the chi-test.

57. The distribution of $\chi^2$. The least squares value of

$$\chi^2 = \frac{1}{\sigma^2} \sum (w_x V_x^2 + w_y V_y^2) \tag{16}$$

for a fitted curve (provided it is the right curve, and the observations
are random) has the probability distribution⁷

$$P(\chi^2)\,d\chi^2 = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,(\chi^2)^{k/2-1}\,e^{-\chi^2/2}\,d\chi^2 \tag{17}$$


wherein 

k = number of points — number of adjustable parameters (18) 

k is commonly called the “degrees of freedom.” It was recognized
by Gauss, though he gave it no particular name (cf. footnote 6 in
Ch. II). The effect of including both x and y residuals in $\chi^2$ is
merely to add the second term in the summation in Eq. 16. The
form of the distribution is unaffected.
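A minimal sketch of Eqs. 16 and 18, together with the standard chi-square density for k degrees of freedom, which is assumed here to be the form intended in Eq. 17. The function names, the argument `sigma2` (standing for $\sigma^2$), and the sample numbers are illustrative assumptions.

```python
import math


def chi_square(residuals, sigma2=1.0):
    """Eq. 16: residuals is a list of (w_x, V_x, w_y, V_y), one per point."""
    s = sum(w_x * v_x ** 2 + w_y * v_y ** 2
            for w_x, v_x, w_y, v_y in residuals)
    return s / sigma2


def degrees_of_freedom(n_points, n_params):
    """Eq. 18: number of points less number of adjustable parameters."""
    return n_points - n_params


def chi_square_density(x, k):
    """Standard chi-square density with k degrees of freedom."""
    return (x ** (k / 2 - 1) * math.exp(-x / 2)
            / (2 ** (k / 2) * math.gamma(k / 2)))


k = degrees_of_freedom(n_points=5, n_params=3)
print(k, round(chi_square_density(2.0, k), 5))  # 2 0.18394
```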

7 See an article by the author in the J. Amer. Stat. Assoc., vol. 29, 1934:
pp. 372-382. A necessary lemma thereto is given in the Phil. Mag., vol. 19,
1935: pp. 389-402.





Historical note. The distribution of $\chi^2$ for problems in curve
fitting where y alone is subject to error was first published by
P. Pizzetti in an article entitled “I fondamenti matematici per
la critica dei risultati sperimentali,” Atti della Regia Università di
Genova, vol. xi, 1892: pp. 113-333. Helmert’s distribution of
s for the simplest problem in curve fitting (Sec. 9), arrived at
also many years later by Student in 1908, can easily be converted
into the distribution of $\chi^2$. Helmert gave this distribution in
Schlömilch’s Zeitschrift für Math. und Physik, vol. 21, 1876:
pp. 300-3, and it is interesting to note that Pizzetti referred to
Helmert’s work. Helmert’s derivation is reproduced in
Emanuel Czuber’s Beobachtungsfehler (Teubner, 1891), pp. 147-150.
Pizzetti’s result was generalized by the author for errors
in both the x and y coordinates, in the paper referred to in
footnote 7. The assumption of normally distributed observations
was presumed.

There is no such thing as a distribution of $\chi^2$ unless the
fitting is done by least squares; in other words, only the minimized
$\chi^2$ has a distribution.


58. Some geometry concerning the adjustment of observations.
Now let us consider some of the details connected with the calculated
points, or the adjusted observations. Let Q be the line
segment joining the observed and calculated points. By Eqs. 12
we can find the slope of this line segment; it is

$$\text{Slope of } Q = \frac{\text{the } y \text{ residual}}{\text{the } x \text{ residual}}
= \frac{V_y}{V_x} = \frac{w_x F_y}{w_y F_x}
= \frac{w_x}{w_y}\,\frac{\partial F/\partial y}{\partial F/\partial x}
= -\frac{w_x}{w_y}\,\frac{dx}{dy} \tag{19}$$

The last step here involves the very important relation learned
in elementary calculus, that if F(x, y) = c, then
$dy/dx = -(\partial F/\partial x)/(\partial F/\partial y) = -F_x/F_y$.

Eq. 19 says that

$$\text{The slope of } Q = -\frac{w_x}{w_y}\cdot\frac{1}{\text{the slope of the curve}} \tag{20}$$






Hence if $w_x = w_y$ at point h, the two slopes are negative reciprocals
of one another, and, if the x and y scales are equal, the line segment Q
will be perpendicular to the curve (but see Exercise 1 at the end of
this section).
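Eq. 20 lends itself to a quick numerical check. The sketch below assumes a curve of slope 2 at the point in question. The second call assumes a change in the units of Y by a factor C = 12, under which $w_x : w_y$ becomes $C^2$ times as large and the slope of the curve C times as large. The function name and the numbers are illustrative assumptions.

```python
def slope_of_q(w_x, w_y, curve_slope):
    """Eq. 20: slope of the segment from observed to calculated point."""
    return -(w_x / w_y) / curve_slope


# Equal weights: Q is perpendicular to a curve of slope 2.
q = slope_of_q(1.0, 1.0, 2.0)
print(q, q * 2.0)               # -0.5 -1.0

# Assumed change of units in Y by the factor C = 12:
# w_x : w_y becomes C**2 times the old ratio, the curve slope C times.
C = 12.0
q_new = slope_of_q(C ** 2, 1.0, 2.0 * C)
print(q_new, q_new * 2.0 * C)   # -6.0 -144.0
```

A product of slopes equal to -1 marks perpendicularity on the graph; after the change of units the product is no longer -1, although the slope of Q has merely been multiplied by C along with everything else.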

If at any point, $w_x : w_y = \infty$ or is very large, which is to say
that X is relatively infallible, then the line segment Q is vertical 
and the adjustment is all in the y coordinate. An exception may 
occur in the neighborhood of any portion of the curve that is 
vertical or nearly so; there the derivatives F x and F y usually affect 
the normal equations and Eq. 20 in such a way that the curve is 
brought close to the observed point, and the line segment Q is 
drawn away from the vertical. 



Fig. 18. A portion of Fig. 17 redrawn for further consideration.

If at any point, $w_y : w_x = \infty$ or is very large, which is to say that
Y is relatively infallible, then the line segment Q is horizontal and 
the adjustment is all in the x coordinate. An exception may occur 
in the neighborhood of any portion of the curve that is horizontal 
or nearly so; there the derivatives F x and F y usually affect the 
normal equations and Eq. 20 in such a way that the curve is brought 
close to the observed point, and the line segment Q is drawn away 
from the horizontal. 

If at any point, $w_x : w_y$ is finite, i.e., both X and Y are subject to
error, the line segment Q will be neither horizontal nor vertical,
but inclined, and there is adjustment in both the x and y coordinates.
Exceptions occur. In the neighborhood of any vertical
portion of the curve, the line segment Q may be pulled to a nearly
horizontal position, and, in the neighborhood of any horizontal
portion of the curve, the line segment Q may be pulled to a nearly
vertical position.

Exercises 

Exercise 1. Suppose that the x and y scales are equal on the
graph of a certain curve, and that, at one of the points, the weights 
of X and Y are equal, wherefore the line segment Q in Fig. 18 
(p. 144) is perpendicular to the curve at that point. Then suppose 
that the graph is redrawn, and that the units in which Y is measured 
are changed, as from feet to inches, while the units of X remain 
unchanged. 

(a) Prove that the line segment Q is no longer perpendicular to 
the fitted curve. (Hint: The ratio $w_y : w_x$ was unity before the 
change in scale, but it is not so afterward. If all the y coordinates 
are multiplied by C, because of the change in units, then the weights 
of all y observations are decreased by the factor $1/C^2$, and the new 
value of $w_x : w_y$ is $C^2$ times the old one. Moreover, the new slope 
of the curve is C times as great as before. By Eq. 20, the slope of 
the line segment Q is also C times as great as before, and it follows 
that Q is no longer perpendicular to the curve.) 

(b) Show that the change in the units of measuring Y affects 
the normal equations only in such a way that the y coordinates of 
the calculated curve and the adjusted points are all multiplied by C. 
(Thus any change in units is automatically taken care of by the 
normal equations. This is in contrast with the arbitrariness of 
curve fitting by eye, by which very different results may arise 
merely from a change in units.) 

Exercise 2. When there are three coordinates, the surface 

\[ F(x, y, z; a, b, c) = 0 \]

is to be fitted to the n observed points. L then contains three 
terms: the two already written in Eq. 8 plus $F_z F_z / w_z$. Show 
that if x, y, z are observed with equal weight at any point, the line 
segment Q joining the observed and calculated points is normal to 
the fitted surface.⁸ In such a problem, the calculated points lie on 
the calculated (fitted) surface. See Section 81 for an example in 
three dimensions, and Exercise 26 of Section 71 for one in four 
dimensions. 

Exercise 3. Show that the minimized sum of squares, S, can be 
written as $\sum W F_0'^2$, wherein $F_0'$ is the left-hand side of Eq. 1 
(p. 130) evaluated with the observed coordinates X, Y and the 
calculated parameters a, b, c; and W is the weight of $F_0'$ (cf. 
Remark 3 at the end of Sec. 54). 

From the way $F_0'$ and W are defined, it turns out that 

\[
\begin{aligned}
F_0' &= F(X, Y; a, b, c) \\
     &= F_0 - F_a A - F_b B - F_c C \quad \text{(neglecting residuals of higher order)} \\
     &= F_x V_x + F_y V_y \quad \text{(see Eq. 7, p. 53)}
\end{aligned}
\tag{21}
\]

and 

\[ W = \frac{1}{L} \quad \text{at point } h \text{ (8)} \]

The solution to the problem then lies in writing, as usual, 

\[ S = \sum (w_x V_x^2 + w_y V_y^2) \]

and noting that from Eqs. 10 (p. 136) 

\[ \lambda = W F_0' \quad \text{at point } h \tag{22} \]

whence Eqs. 12 give 

\[ V_x = \frac{1}{w_x} W F_0' F_x, \qquad V_y = \frac{1}{w_y} W F_0' F_y \tag{23} \]

for the x and y residuals at point h. Substitution into $w_x V_x^2 + w_y V_y^2$ 
gives the required result in terms of $F_0'$ (due to Kummell, 1879). 
This result is useful in Exercise 3 of Section 61, 
page 163. 
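
The substitution can be written out in one line. This is only a sketch of the step just described, assuming that L has the two-term form $F_x F_x / w_x + F_y F_y / w_y$ of Eq. 8 and that $W = 1/L$ at each point:

\[
w_x V_x^2 + w_y V_y^2
  = W^2 F_0'^2 \left( \frac{F_x^2}{w_x} + \frac{F_y^2}{w_y} \right)
  = W^2 F_0'^2 L
  = W F_0'^2 ,
\]

so that summing over the n points gives $S = \sum W F_0'^2$.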

Exercise 4. Prove that W in the preceding exercise is actually 
the weight of $F_0'$, i.e., of F(X, Y; a, b, c). (Hint: Apply Eq. 9, 
p. 40.) Hence the new expression $\sum W F_0'^2$ for S can be regarded 
as a sum of the weighted squares of residuals, $F_0'$ now being defined 
as a new kind of residual. 

⁸ This result was proved by the author in the Phil. Mag., vol. 11, 1931: pp. 
146-156. 

Exercise 5. If the x coordinate is free of error, $W F_0'^2$ is equal to 
$w_y V_y^2$, and, if Y is free of error, $W F_0'^2$ is equal to $w_x V_x^2$. 

Exercise 6. Prove that for any observed point in the neighborhood 
of which the slope of the fitted curve is positive, the residuals 
$V_x$ and $V_y$ will have opposite signs; but, if the slope is negative, 
then $V_x$ and $V_y$ will have the same sign. In other words, when 
the fitted curve lies below the observed point, then the calculated 
point lies below and to the right of the observed point if the slope 
of the fitted curve is positive, but below and to the left if the slope 
is negative; and when the fitted curve lies above the observed 
point, then the calculated point lies above and to the left if the 
slope is positive, above and to the right if the slope is negative 
(see Fig. 19). 



Fig. 19. Showing the possible and impossible positions of the calculated 
point. 

Exercise 7. Show that in fitting 

\[ y = a + bx + cx^2 \tag{24} \]

with y alone subject to error, Eqs. 10 reduce to 

\[ \lambda = w_y \{ Y - (a_0 + b_0 x + c_0 x^2) + (A + Bx + Cx^2) \} \tag{25} \]

\[ \phantom{\lambda} = w_y (Y - y) \tag{26} \]

whence Eqs. 12 and 13 on page 138 reduce to 

\[ y = Y - \{ Y - (a_0 + b_0 x + c_0 x^2) + (A + Bx + Cx^2) \} \tag{27} \]

or 

\[ y = a + bx + cx^2 \]

In this circumstance, therefore, y may be calculated at any point 
merely by substituting the x coordinate into the equation with the 
adjusted values of a, b, c. 



CHAPTER IX 


SYSTEMATIC COMPUTATION FOR FITTING 
CURVES BY LEAST SQUARES 

59. Preliminary note on the tabular solution. In the systematic 
solution of the normal equations for geometric conditions, given 
on pages 82 and 83, we saw an easy way of computing the mini¬ 
mized sum of squares, S. In Section 61, we shall see that the same 
routine for solving the normal equations in curve fitting will also 
yield S. We shall, moreover, see how our initial approximation 
to the sum of squares is diminished as one parameter after another 
is adjusted. 

In order to gain some preliminary familiarity with these characteristics 
of the routine, we shall return to the simple illustration 
considered in Section 10, where we had the n observations and 
weights listed in columns 1 and 2 of the table below. The 
weights could be merely the relative frequencies of occurrence. 

Observations and weights (columns 1 and 2) 
Computations for finding x̄ and s (columns 3 and 4) 

(1)            (2)       (3)                      (4)
Observation    Weight    Weighted deviation       Weighted square of the
                         from a_0                 deviation from a_0

x_1            w_1       w_1(x_1 - a_0)           w_1(x_1 - a_0)^2
x_2            w_2       w_2(x_2 - a_0)           w_2(x_2 - a_0)^2
x_3            w_3       w_3(x_3 - a_0)           w_3(x_3 - a_0)^2
...            ...       ...                      ...
x_n            w_n       w_n(x_n - a_0)           w_n(x_n - a_0)^2

Wtd. av.                 Σw(x - a_0)/Σw ≡ P       Σw(x - a_0)^2/Σw ≡ Q

The problem now is the same as it was in Section 10, namely, to 
find what value of the parameter a in the equation 

x = a (1) 

renders the (weighted) sum of squares, S, a minimum. Now, 
however, we shall start off differently, because we wish to pattern 
the solution after the framework to be explained in Section 61 for 
more complicated problems. We shall use an approximation $a_0$ 
for a, and shall correct it later by finding the residual A, which, 
when subtracted from $a_0$, gives the final value of a (turn back to 
Eqs. 6 on p. 52). 

Corresponding to Eqs. 11 in Section 55 (the normal equations 
for curve fitting), there will here be one and only one normal equa¬ 
tion, with one unknown. We proceed to calculate the one and 
only L coefficient therein; also the right-hand member. 

For the present problem we set 

\[ F = x - a \quad \text{(Eq. 1 of Ch. VIII)} \]

whereupon 

\[ F_0 = x_h - a_0 \quad \text{(for observation No. } h\text{)} \]

The derivatives are 

\[ F_x = 1, \qquad F_a = -1 \quad \text{(turn to Eq. 5 of Ch. VIII)} \]

whence 

\[ L = \frac{F_x F_x}{w_x} = \frac{1}{w} \quad \text{(Eq. 8 of Ch. VIII)} \]

The right-hand member of the one and only normal equation will be 

\[ \left[ \frac{F_a F_0}{L} \right] = -\sum w(x - a_0) \]

The sum of squares formed with $a_0$ will be 

\[ \sum w(x - a_0)^2 \]

which, be it noted, can be written as $[F_0 F_0 / L]$, a symbol that in 
more complicated problems denotes the sum of squares formed with 
the approximate values $(a_0, b_0, c_0)$ of the parameters. In Section 61 
and beyond, this symbol is abbreviated to [oo]. 

We now make substitutions into Eqs. 11 on page 136. Row I 
constitutes the one and only normal equation. For reasons that 
may become clear later, we introduce Row 2, containing in the 
“1” column the sum of squares formed with the approximation $a_0$. 
Rows 3 and II are formed by the manipulations described in the 
column “How obtained.” 

Row   How obtained                       A      = 1
I                                        Σw     -Σw(x - a_0)
2                                               Σw(x - a_0)^2
3     Multiply I by +Σw(x - a_0)/Σw             -{Σw(x - a_0)}^2/Σw
II    Add 2 and 3                               Σw(x - a_0)^2 - {Σw(x - a_0)}^2/Σw


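Solving Row I for A, we get 

\[ A = -\frac{\sum w(x - a_0)}{\sum w} \tag{2} \]

whereupon 

\[
\begin{aligned}
a &= a_0 - A \quad \text{(cf. Eqs. 6, p. 52)} \\
  &= a_0 + \frac{\sum w(x - a_0)}{\sum w} \\
  &= a_0 + \frac{\sum wx}{\sum w} - a_0 \\
  &= \frac{\sum wx}{\sum w} \\
  &= \bar{x} \quad \text{(by definition of } \bar{x}\text{)}
\end{aligned}
\tag{3}
\]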


The least squares value for a is thus $\bar{x}$, the weighted mean of the 
observations, as was obtained by the direct solution in Section 10a. 
The extreme left entry in Row II is none other than S, the 
minimized sum of squares. This is so because 

\[
\begin{aligned}
\text{(the extreme left entry in Row II)}
  &= \sum w(x - a_0)^2 - \frac{\{\sum w(x - a_0)\}^2}{\sum w} \\
  &= \sum wx^2 - \frac{(\sum wx)^2}{\sum w} \\
  &= \sum wx^2 - \bar{x}^2 \sum w
\end{aligned}
\tag{4}
\]

which is the value of S shown by Eq. 11 in Section 10a, page 19. 


This tabular solution will be extended later in this chapter to the 
calculation of parameter residuals (called A, B, C) and the minimized 
sum of squares, S, in more complicated problems in curve 
fitting. 


At this point, it is interesting to perceive that the tabular solution 
just described is equivalent to a certain rapid method, often 
used in statistics, for computing the mean $\bar{x}$ and the standard 
deviation s of a set of n observations such as those shown in column 1 
above. The method will be described in steps. 

i. Select an arbitrary datum, perhaps a rounded-off guess 
at $\bar{x}$, which takes the place of the approximation $a_0$ in the tabular 
solution just described. 

ii. Write down in column 3 the deviations of the observations 
from $a_0$, and weight them. 

iii. Form the squares of these deviations, weight them, and 
enter the weighted squares in column 4. 

iv. Take the weighted averages of columns 3 and 4. Call 
these averages P and Q. They are the correction factors to be 
used in finding $\bar{x}$ and $s^2$, according to Eqs. 5 and 6, ahead. 

The weighted mean and standard deviation of the n observations 
are then calculated by writing 

\[ \bar{x} = a_0 + P \tag{5} \]

\[ s^2 = Q - P^2 \tag{6} \]
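
Steps i to iv and Eqs. 5 and 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rapid_mean_and_variance` and its argument names are hypothetical, and the observations and weights are taken to be plain lists of numbers.

```python
def rapid_mean_and_variance(xs, ws, a0):
    """Weighted mean, variance, and minimized sum of squares by the
    rapid method: deviations are reckoned from an arbitrary datum a0."""
    total_w = sum(ws)
    # Column 3: weighted deviations from the datum a0.
    col3 = [w * (x - a0) for x, w in zip(xs, ws)]
    # Column 4: weighted squares of the deviations from a0.
    col4 = [w * (x - a0) ** 2 for x, w in zip(xs, ws)]
    # Weighted averages of columns 3 and 4 (the correction factors).
    P = sum(col3) / total_w
    Q = sum(col4) / total_w
    mean = a0 + P          # Eq. 5
    variance = Q - P * P   # Eq. 6
    S = total_w * variance # minimized sum of squares, (sum of w) * s^2
    return mean, variance, S
```

The datum `a0` is free: calling the function with two different values of `a0` on the same data should give the same mean and variance, apart from rounding.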


Now the correction factor P may be considered either as the 
average residual reckoned from the arbitrary datum $a_0$, or as the 
distance between $\bar{x}$ and $a_0$. Q is the average squared residual, 
when the residuals are reckoned from $a_0$, and $s^2$ is the average 
squared residual, when the residuals are reckoned from a. In 
words, Eqs. 5 and 6 state that 

\[ \bar{x} = (\text{the arbitrary datum } a_0) + (\text{the average residual reckoned from } a_0) \tag{7} \]

\[
\begin{aligned}
s^2 &= \text{the average squared residual reckoned from } a \\
    &= (\text{the average squared residual reckoned from } a_0)
       - (\text{the distance between } a \text{ and } a_0)^2
\end{aligned}
\tag{8}
\]

It can be seen from Eq. 4 that the extreme left entry in Row II 
(p. 150) divided by $\sum w$ is 

\[ Q - P^2 \]

which is none other than $s^2$. Therefore the (minimized) sum of 
squares S is just $(\sum w)s^2$. Hence the tabular scheme shown 
above for calculating A and S is equivalent to the steps outlined 
for the rapid method for getting $\bar{x}$ and $s^2$. 

It is important to note that the entry $\sum w(x - a_0)^2$ in Row 2 
in the “1” column arises by summing squares of deviations 
reckoned from $a_0$, and the quantity in Row 3 just below it is precisely 
the amount by which $\sum w(x - a_0)^2$ must be diminished to 
get the minimum sum of squares, called S. Likewise, the quantity 
Q arises by summing squares of deviations reckoned from $a_0$, and 
$P^2$ is the amount by which this sum of squares must be diminished 
to get $s^2$ (see Eq. 6). 

In this example, both in the tabular solution and the “rapid 
method” for computing $\bar{x}$ and $s^2$, the number $a_0$ need not be close to 
a. It can have any value whatever, but in practice it will usually 
be a rounded-off guess at the mean (which is the final value of a). 

60. Systematic procedure for forming the normal equations for 
the parameters. There will be a formula to be fitted. It might be 

\[ y = a + bx + cx^2 \tag{9} \]

or it might be 

\[ y = ae^{bx} \tag{10} \]

or it might be something else. Whatever it is, we can transpose 
one member and write it in the form 

\[ F(x, y; a, b, c) = 0 \quad \text{(same as Eq. 1, p. 130)} \tag{11} \]

Thus, for Eq. 9 above, F would be $y - (a + bx + cx^2)$, and for 
Eq. 10, F would be $y - ae^{bx}$. 

The reader may recall the use of the symbol $F_0$ in the preceding 
chapter (Sec. 52). $F_0$ stood for the value of the function F at some 
particular observed point X, Y (not a calculated point), and evaluated 
furthermore with the approximate parameters $a_0, b_0, c_0$. For 
instance, for Eq. 9, having fixed the form of the function F 
in the manner just described, the numerical value of $F_0$ at the 
(observed) point X, Y would be $Y - (a_0 + b_0 X + c_0 X^2)$. For 
Eq. 10, $F_0$ would be $Y - a_0 e^{b_0 X}$. 

In fitting a function by least squares, the first thing to do is to 
fix the form of the function F by transposing all terms of the for¬ 
mula to one side of the equation, to get it in the form of Eq. 11. 
The steps then to be followed are outlined below. It is interesting 
to compare these steps with those of Section 33, wherein there 
were no parameters. 

1st step. (a) Work out somehow satisfactory approximations 
$a_0, b_0, c_0$ for the parameters (cf. the reduced type in Secs. 25 and 
55); (b) calculate numerical values of $F_0$ at every point. 

In some problems, depending on the formula and the weighting, 
it is permissible to take $a_0 = b_0 = c_0 = 0$ when calculating $F_0$ 
(but not L), in which event the residuals A, B, C turn out to be 
the adjusted values −a, −b, −c themselves (cf. Exercises 4, 5, and 
10 in Secs. 65-6). But this is not usually advisable even when permissible. 
As a matter of saving time, a good rule is to commence the 
adjustment with as good approximations $a_0, b_0, c_0$ as can be found 
with a reasonable amount of trouble, and thus to cut down the 
number of figures required in the formation and solution of the 
normal equations. 

2d step. This step requires some differential calculus. It 
consists of writing down the various derivatives of F, namely 

\[ F_a, \; F_b, \; F_c, \; F_x, \text{ and } F_y \]

The first three may be needed in forming the normal equations, the 
last two for calculating 

\[ L = \frac{F_x F_x}{w_x} + \frac{F_y F_y}{w_y} \quad \text{(same as Eq. 8, p. 134)} \]

and the summation required at the n points. L may vary from 
point to point, and some or all of the derivatives $F_a$, $F_b$, and $F_c$ 
almost surely will. So will $F_0$. 

3d step. Work out the numerical values of $F_a$, $F_b$, $F_c$, and L, at 
every point. The following tabulation is suggested. 

TABLE 1 

Preliminary to the matrix (3d step) 

Of course auxiliary columns may be required, depending on the 
problem and the whims of the computer. Or, perhaps some 
columns listed will not be needed, e.g., if $w_x$ were $\infty$ all the way 
down (x free from error) then $F_x$ and $w_x$ would be omitted, since 
y alone would contribute to L, which would be merely $F_y F_y / w_y$. 
Likewise, if y were free from error all the way down, then the $F_y$ 
and $w_y$ columns would not be needed, for then L would be simply 
$F_x F_x / w_x$. 

4th step. Divide each entry under $F_a$, $F_b$, $F_c$, and $F_0$ by the 
corresponding $\sqrt{L}$. The sums at the right or bottom (one but not 
both) of Table 2 can be formed by cumulating these quotients in 
the horizontal or vertical, the individual quotients being entered in 
the table. (This cumulation requires a machine with a double 
multiplying dial, one to be locked for cumulating quotients, while 
the other clears when desired. See a remark following Table 2 in 
Sec. 33, p. 72.) 

TABLE 2 

The matrix for the formation of the normal equations¹ (4th step) 



The sums at the right and along the bottom are used for checking 
the formations of the normal equations exactly as was done with 
Table 2 in Section 33. First of all, the sum across the bottom 
should equal the sum down the right-hand side, as indicated by 
the check mark. In running down the columns, cumulating squares 
and cross-products (the fifth step, p. 156), the final total in the multi¬ 
plier register will equal the sum at the bottom of the multiplier 
column provided no changes in sign occur in the multiplicand 
column. In a machine with a double multiplier register, one part 
of which can be locked for cumulation while the other clears, 
individual multipliers can be checked at will in one dial, while 
the sum of the multipliers cumulates in the other one for checking 
at the bottom. 

A maximum of three or four significant figures in any column 
will suffice. This means that if there is great variation in the 
sizes of the numbers in any column, some entries in Table 2 may 
have only two, or one, or not even any figures; see, for instance, 
pages 213 and 224; also page 79. 

The denominations of the different columns should be made 
uniform by writing powers of 10 at the top of each row, to apply 
to the whole column (see the solved examples at the end; also the 
one in Sec. 34). No attention need be given to the powers of 10 
until the end, when the solution of the normal equations is decoded. 

¹ Concerning the use of the term matrix here, see the note appended to 
Table 2 in Ch. VI, p. 72. 


5th step. Form the normal equations from Table 2 by the famil¬ 
iar process of adding squares and cross-products of columns. 
Thus, no matter how complicated the weighting, and no matter 
what be the form of the fitted curve, the whole procedure is uni¬ 
form, and we are brought to a uniform and familiar process for 
the formation of the normal equations. 
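
The five steps can be sketched in Python for the exponential formula of Eq. 10, with both coordinates subject to error. This is a minimal illustration and not the tabular routine itself. The function names `table2_rows` and `normal_equations` are hypothetical, each point is assumed to be given as a tuple `(X, Y, wx, wy)`, and the derivatives are assumed to be evaluated at the observed point with the approximations `a0`, `b0`.

```python
import math

def table2_rows(points, a0, b0):
    """Steps 1 to 4 for F = y - a*exp(b*x): one row of Table 2 per point,
    holding Fa/sqrt(L), Fb/sqrt(L), F0/sqrt(L)."""
    rows = []
    for X, Y, wx, wy in points:
        e = math.exp(b0 * X)
        F0 = Y - a0 * e                  # 1st step: F at the observed point
        Fa, Fb = -e, -a0 * X * e         # 2d step: parameter derivatives
        Fx, Fy = -a0 * b0 * e, 1.0       # 2d step: coordinate derivatives
        L = Fx * Fx / wx + Fy * Fy / wy  # Eq. 8
        root = math.sqrt(L)
        rows.append((Fa / root, Fb / root, F0 / root))  # 4th step
    return rows

def normal_equations(rows):
    """5th step: sums of squares and cross-products of the columns.
    For two parameters the result is [[aa, ab, ao], [ab, bb, bo], [ao, bo, oo]]."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)]
            for i in range(k)]
```

The last row and column of the returned array carry the right-hand members [ao], [bo] and the sum of squares [oo], so the whole array can be handed to whatever routine is used to solve the normal equations.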

As already suggested at the commencement of this section, the 
student should compare this matrix with the previous Table 2 of 
Section 33 (p. 72), which arose in the consideration of conditions 
not containing parameters. The headings in the tables are different 
there, of course; but to the computer, the routine procedure of 
forming the normal equations from Table 2 is the same here as it was 
there. Also, the routine of solution is the same (compare Secs. 34 
and 61). The exercises in Chapter X will provide practice in the 
necessary steps for setting up the normal equations for several 
types of functions. 

Remark. By the procedure here explained for the formation 
of Table 2, whence the normal equations are to be set up, the 
solution of the normal equations (i.e., the values of A, B, C, 
etc.) is unequivocal. That is, it does not matter in what form 
the equation to be fitted is written. If one had, for example, 
$y = ae^{bx}$, he could put the same equation in the form $\ln y = \ln a + bx$, 
using in the former case $F = y - ae^{bx}$ and in the latter 
case, $F = \ln y - \ln a - bx$. Further illustrations will occur in 
the exercises of Chapter X. When the normal equations are 
made up according to the steps outlined above, the results will 
be the same from any form of the fitted equation, to within higher 
powers of the residuals. 

As another example, the straight line can be written as y = a 
+ bx or as x = —a/b + y/b, and F may be y — a — bx or 
x + a/b — y/b. Either way, the results will be the same to 
within higher powers of the residuals. Summed up, the re¬ 
sults — the final calculated parameters, and the adjusted observa¬ 
tions, are independent of the form in which the equation is written. 

Very large residuals, i.e., very rough data, will invalidate this 
statement to some extent, but if the data are as rough as that, 
they may not be worth fitting anyhow. See some other remarks 
in Section 26, also in Exercises 18 and 23 of Chapter X. 

61. Systematic solution of the normal equations. The reciprocal 
matrix. Systematic computation of S. As has just been 
noted, the sums of squares and cross-products occurring in the 
normal equations for curve fitting (p. 136) are formed directly from 
Table 2 of the preceding section. Thus, the summation $[F_a F_a / L]$ 
on page 136 is the sum of squares under the column of Table 2 
headed $F_a/\sqrt{L}$; the summation $[F_a F_b / L]$ is the cumulation of 
cross-products under the columns of Table 2 headed $F_a/\sqrt{L}$ and 
$F_b/\sqrt{L}$; $[F_a F_0 / L]$ is the cumulation of cross-products under the 
columns headed $F_a/\sqrt{L}$ and $F_0/\sqrt{L}$; etc. 

We shall suppose that the normal equations have been formed 
in this manner. The numerical values of the squares and cross-products 
called for on page 156 will be entered as numbers in Rows 
I, 2, 3, 4, page 158. On this page, the abbreviated symbols 

[aa] for $[F_a F_a / L]$ 
[ab] for $[F_a F_b / L]$ 
[oo] for $[F_0 F_0 / L]$ 

etc., have been introduced for convenience. The unit matrix in 
the columns $C_1$, $C_2$, $C_3$ is entered for the calculation of the reciprocal 
matrix, and the sums at the right are formed for checking. The 
Gauss symbols [bb.1], [cc.2], etc., seen in Rows II and III, will 
facilitate reference to certain entries later on, as in the exercises 
beginning on page 161. 

The solution proceeds according to the operations of multiplication 
and addition indicated by the directions under the column 
headed “How obtained.” The procedure here outlined is similar 
to Doolittle’s² solution, which in turn goes back to Gauss.³ The 
check marks show the “sum check” at the pivotal points. The 
normal equations on pages 82 and 83 were solved this way. Further 
numerical examples occur in Chapter XI. Note that A is 
eliminated in Row II; A and B are both eliminated in Row III. 
The values of the parameter-residuals appear in Rows 11, 12, and 
13, in the “1” column. 

² M. H. Doolittle, Coast and Geodetic Survey Report for 1878 (Washington), 
App. 8, pp. 115-118. 

³ Gauss, Supplementum Theoriae Combinationis (Göttingen, 1826; Werke, 
vol. 4), Art. 13. 


Row 11 comes by dividing III through by [cc.2] to get C. 
Row 12 comes by substituting from 11 into II to get B. 

Row 13 comes by substituting from 11 and 12 into I to get A. 
This is the “ back solution.” 


The normal equations and their solution 

                            Unknown
Row  How obtained           A      B              C                  = 1                    C_1          C_2              C_3              Sum
I                           [aa]   [ab]           [ac]               [ao]                   1            0                0                ... ✓
2                                  [bb]           [bc]               [bo]                   0            1                0                ... ✓
3                                                 [cc]               [co]                   0            0                1                ... ✓
4                                                                    [oo]                   0            0                0                ... ✓
5    I × -[ab]/[aa]                -[ab]^2/[aa]   -[ac][ab]/[aa]     -[ao][ab]/[aa]         -[ab]/[aa]   0                0
II   2 + 5                         [bb.1]         [bc.1]             [bo.1]                 ...          1                0                ... ✓
6    I × -[ac]/[aa]                               -[ac]^2/[aa]       -[ao][ac]/[aa]         -[ac]/[aa]   0                0
7    II × -[bc.1]/[bb.1]                          -[bc.1]^2/[bb.1]   -[bo.1][bc.1]/[bb.1]   ...          -[bc.1]/[bb.1]   0
III  3 + 6 + 7                                    [cc.2]             [co.2]                 ...          ...              1                ... ✓
8    I × -[ao]/[aa]                                                  -[ao]^2/[aa]           -[ao]/[aa]   0                0
9    II × -[bo.1]/[bb.1]                                             -[bo.1]^2/[bb.1]       ...          -[bo.1]/[bb.1]   0
10   III × -[co.2]/[cc.2]                                            -[co.2]^2/[cc.2]       ...          ...              -[co.2]/[cc.2]
IV   4 + 8 + 9 + 10                                                  S                      ...          ...              ...              ... ✓
13   I solved for A                                                  A                      c_11         c_12             c_13
12   II solved for B                                                 B                      c_21         c_22             c_23
11   III solved for C                                                C                      c_31         c_32             c_33             ... ✓

Note. The ellipsis (...) in the tabular array denotes a space wherein a 
number would ordinarily be entered in numerical calculation, but in which it 
is not worth while to show the entry in symbols. 



The reciprocal matrix is set off in Rows 11, 12, and 13, in the 
columns headed $C_1$, $C_2$, $C_3$. The entries $c_{11}$, $c_{21}$, $c_{31}$ in the $C_1$ 
column are the values that would be obtained for A, B, C if the 
right-hand members of the normal equations were 1, 0, 0, as found in 
the $C_1$ column. Likewise, the entries $c_{12}$, $c_{22}$, $c_{32}$ in the $C_2$ column 
are the values that would be obtained for A, B, C if the 
right-hand members of the normal equations were 0, 1, 0, as found in 
the $C_2$ column. Similar remarks hold for the entries $c_{13}$, $c_{23}$, $c_{33}$ 
in the $C_3$ column. The values of the elements of the reciprocal 
matrix, in terms of the coefficients comprising the normal equations, 
are given in Exercise 2a, following this section. 
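
The routine of page 158 can be imitated in Python as plain forward elimination on the normal equations, augmented with the right-hand members and the unit matrix, followed by the back solution. The sketch below is an illustration under assumptions, not a transcription of the tabular form: the name `solve_normal_equations` is hypothetical, the full symmetric array is stored rather than the upper triangle alone, and no sum check is carried.

```python
def solve_normal_equations(N, rhs, oo):
    """N: k-by-k symmetric coefficients ([aa], [ab], ...);
    rhs: right-hand members ([ao], [bo], ...); oo: the sum of squares [oo].
    Returns the unknowns, the reciprocal matrix, the pivots
    ([aa], [bb.1], [cc.2], ...) and the minimized sum of squares S."""
    k = len(N)
    # Each row: coefficients, right-hand member, then a row of the unit matrix.
    rows = [list(N[i]) + [rhs[i]] + [1.0 if i == j else 0.0 for j in range(k)]
            for i in range(k)]
    pivots, S = [], oo
    for p in range(k):
        pivot = rows[p][p]
        pivots.append(pivot)
        # Reduction of the sum of squares (Rows 8, 9, 10 of the table).
        S -= rows[p][k] ** 2 / pivot
        # Eliminate unknown p from the rows below (Rows 5, 6, 7).
        for i in range(p + 1, k):
            m = rows[i][p] / pivot
            rows[i] = [x - m * y for x, y in zip(rows[i], rows[p])]
    # Back solution (Rows 11, 12, 13), done for the "1" column and for
    # each column of the unit matrix at once.
    width = k + 1
    sol = [[0.0] * width for _ in range(k)]
    for i in reversed(range(k)):
        for c in range(width):
            acc = rows[i][k + c]
            for j in range(i + 1, k):
                acc -= rows[i][j] * sol[j][c]
            sol[i][c] = acc / rows[i][i]
    unknowns = [s[0] for s in sol]
    reciprocal = [s[1:] for s in sol]
    return unknowns, reciprocal, pivots, S
```

The product of the returned pivots is the determinant of the coefficients, and a very small pivot is the numerical sign of the near indeterminacy discussed in the remarks that follow.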

Remark 1. Many variations of the procedure shown on 
page 158 have been published. Each possesses merits peculiar 
to the machines available, preference of the operator, and 
other circumstances. The computer should be expected to 
develop variations that are advantageous to the peculiar re¬ 
quirements and conditions under which he works, and to his 
likes and dislikes. 

Remark 2. Methods of solution quite different from that 
described above have been contrived, but not yet adapted to 
mass production. Some of them are devices for calculating 
the reciprocal matrix to be used as a multiplier, for example, 
T. Smith’s,⁴ and a very promising scheme of matrix squaring 
devised by Hotelling and Girshick on the basis of a theorem 
regarding the characteristic equation of a determinant.⁵ In another 
direction there is Kelley and Salisbury’s⁶ ingenious acceleration 
of an iterative process usually known as Seidel’s (1874), 
though described earlier by Gauss and Jacobi,⁷ the same being 
particularly effective when good initial approximations are 
available. Then there is a fascinating pivotal process invented 
by Aitken⁸ in 1932, after T. Smith’s method; he has now, 
however, superseded this solution by the introduction of a 
number of important unpublished refinements. It is also 
interesting to note that electrical circuit machines, capable 
of solving something like 10 linear equations, practically instantaneously 
after plugging the coefficients, are in operation and 
undergoing further development at several centers. 

⁴ T. Smith, “The calculation of determinants and their minors,” Phil. 
Mag., vol. 3, 1927: pp. 1007-9. 

⁵ This was published by M. D. Bingham, J. Amer. Stat. Assoc., vol. 36, 
1941: pp. 530-534. 

⁶ Truman L. Kelley and Frank S. Salisbury, J. Amer. Stat. Assoc., vol. 21, 
1926: pp. 281-292. 

⁷ Whittaker and Robinson, Calculus of Observations (Blackie & Son, 1924), 
Art. 130. 

Remark 3. The reciprocal matrix contains the variance and 
product variance coefficients for the parameters a, b, c. Its 
use in this connexion will be illustrated in Section 62. 

Remark 4. The reciprocal matrix has also another use, 
namely, as a multiplier for finding the unknowns in the normal 
equations, in the same manner in which it was used in Eqs. 23, 
24, and 26, on pages 93 and 94. The theory of the reciprocal 
matrix as a multiplier originated with Gauss.⁹ The essentials 
of this theory are contained in the exercises following. 

The “reciprocal solution,” gotten by using the reciprocal 
matrix as a multiplier, is a very sensitive indicator of instability. 
It is just for this reason that the reciprocal solution is likely 
to break down in the case of near indeterminacy,¹⁰ a fact 
that detracts rather drastically from its usefulness in the solution 
of normal equations in curve fitting, where near indeterminacy 
is surprisingly common. Near indeterminacy exists 
when Δ, the determinant of the coefficients, is very small. 
The freezing of the solution (the near vanishing of one of the 
extreme left coefficients, such as the entry [cc.2] in Row III) 
is indicative of near indeterminacy, which is usually but not 
always accompanied by instability. 

With regard to the source of near indeterminacy and the 
remedy, Palmer¹¹ gives this excellent advice. “... it occasionally 
happens that one of the equations is so nearly a multiple or 
submultiple of another that an exact solution becomes difficult 
if not impossible. In such cases the number of observation 
equations may be increased by making additional measurements 
on quantities that can be represented by known functions of the 
desired unknowns. The conditions under which these measurements 
are made can generally be so chosen that the new set of 
normal equations, derived from all of the observation equations 
now available, will be so distinctly independent that the 
solution can be carried out without difficulty to the required 
degree of precision.” 

⁸ A. C. Aitken, “On the evaluation of determinants, the formation of their 
adjugates,” Proc. Edinburgh Math. Soc., vol. 3, 1932: pp. 207-219. 

⁹ Gauss, Supplementum Theoriae Combinationis Erroribus Minimis Obnoxiae 
(Göttingen, 1826; Werke, vol. 4), Art. 8. 

¹⁰ See a paper by the author in Science, May 7, 1937; also Henry Schultz, 
The Theory and Measurement of Demand (Chicago, 1938), pp. 761-3. Appendix C 
can be highly recommended for techniques of curve fitting. 

¹¹ A. de Forest Palmer, The Theory of Measurements (McGraw-Hill, 1912), 
p. 77. This book, by the way, is one of the best on experimental science and 
scientific inference. 

Remark 5. An important consideration in the solution 
of equations is the maximum error in the values found for the 
unknowns (maximum error, not just the average or standard 
error) arising from errors in the coefficients. Tuckerman¹² 
shows a simple procedure by which this maximum error can be 
determined. 


Exercises 


Exercise 1. (a) The determinant of the coefficients of the 
normal equations on page 158 can be evaluated as 

\[
\Delta =
\begin{vmatrix}
[aa] & [ab] & [ac] \\
[ab] & [bb] & [bc] \\
[ac] & [bc] & [cc]
\end{vmatrix}
= [aa]\,[bb.1]\,[cc.2]
\tag{12}
\]

which is to say that the determinant Δ is the product of the extreme 
left numbers in the Roman-numbered Rows I, II, III. This result 
is important, because it shows that in near indeterminacy, i.e., 
when Δ is small, one of these factors on the right will be small. 
The so-called phenomenon of freezing (the vanishing of [bb.1] or 
[cc.2]) is thus associated with a small determinant, which usually 
but not always gives rise to instability. (See the reference to 
Tuckerman below.) 


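Hint: Chiò’s pivotal expansion will be found admirably 
suited to the demonstration of Eq. 12. The work might proceed 
as follows, the pivot element being unity in the upper 
left corner. 

\[
\begin{aligned}
\begin{vmatrix}
[aa] & [ab] & [ac] \\
[ab] & [bb] & [bc] \\
[ac] & [bc] & [cc]
\end{vmatrix}
&= [aa]
\begin{vmatrix}
1 & \dfrac{[ab]}{[aa]} & \dfrac{[ac]}{[aa]} \\
[ab] & [bb] & [bc] \\
[ac] & [bc] & [cc]
\end{vmatrix} \\
&= [aa]
\begin{vmatrix}
[bb] - \dfrac{[ab][ab]}{[aa]} & [bc] - \dfrac{[ab][ac]}{[aa]} \\
[bc] - \dfrac{[ac][ab]}{[aa]} & [cc] - \dfrac{[ac][ac]}{[aa]}
\end{vmatrix} \\
&= [aa]
\begin{vmatrix}
[bb.1] & [bc.1] \\
[bc.1] & [cc] - \dfrac{[ac][ac]}{[aa]}
\end{vmatrix} \\
&= [aa]\,[bb.1]
\begin{vmatrix}
1 & \dfrac{[bc.1]}{[bb.1]} \\
[bc.1] & [cc] - \dfrac{[ac][ac]}{[aa]}
\end{vmatrix} \\
&= [aa]\,[bb.1]\,[cc.2]
\end{aligned}
\]

¹² L. B. Tuckerman, Annals of Math. Statistics, vol. 12, 1941: pp. 307-316. 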


(b) Show that none of the extreme left entries in Rows I, II, 
and III can be negative. (Hint: Make use of Sec. 29. Or, use the 
Schwarz-Christoffel inequality.) 


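Exercise 2. The matrix reciprocal to Δ can be denoted by 

\[
\begin{matrix}
c_{11} & c_{12} & c_{13} \\
c_{21} & c_{22} & c_{23} \\
c_{31} & c_{32} & c_{33}
\end{matrix}
\tag{13}
\]

(a) Show that the solution of the normal equations with the 
constant columns $C_1$, $C_2$, and $C_3$ leads to the values 

\[
\begin{aligned}
c_{11} &= \frac{\text{cof. of } [aa]}{\Delta}, &
c_{12} &= \frac{\text{cof. of } [ab]}{\Delta}, &
c_{13} &= \frac{\text{cof. of } [ac]}{\Delta} \\
c_{21} &= \frac{\text{cof. of } [ab]}{\Delta}, &
c_{22} &= \frac{\text{cof. of } [bb]}{\Delta}, &
c_{23} &= \frac{\text{cof. of } [bc]}{\Delta} \\
c_{31} &= \frac{\text{cof. of } [ac]}{\Delta}, &
c_{32} &= \frac{\text{cof. of } [bc]}{\Delta}, &
c_{33} &= \frac{\text{cof. of } [cc]}{\Delta}
\end{aligned}
\]

The abbreviation cof. denotes cofactor. 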

Since Δ falls in the denominator of each element, a small value of 
Δ, near indeterminacy, results in high standard errors of the 
parameters; see Exercises 2a and 12b of Chapter X. 

(b) Like the determinant Δ of the coefficients, the reciprocal 
matrix $\Delta^{-1}$ is symmetrical, i.e., $c_{12} = c_{21}$, $c_{13} = c_{31}$, $c_{23} = c_{32}$. 

(c) The matrices Δ and $\Delta^{-1}$ are also alike in another respect: 
the terms on the main diagonal will always be positive. 

(d) Show that $c_{33} = 1/[cc.2]$ = the reciprocal of the coefficient 
of the third unknown in Row III. 


Exercise 3. (a) Combine Eq. 17 on page 57, and Eqs. 10 on 
page 136 to get 

\[ S = [oo] - [ao]A - [bo]B - [co]C \tag{14} \]

Show also that 

\[
S =
\frac{
\begin{vmatrix}
[aa] & [ab] & [ac] & [ao] \\
[ab] & [bb] & [bc] & [bo] \\
[ac] & [bc] & [cc] & [co] \\
[ao] & [bo] & [co] & [oo]
\end{vmatrix}
}{
\begin{vmatrix}
[aa] & [ab] & [ac] \\
[ab] & [bb] & [bc] \\
[ac] & [bc] & [cc]
\end{vmatrix}
}
\tag{15}
\]

Hint: Expand the numerator and get 

\[
S = [oo]
- [ao] \frac{
\begin{vmatrix}
[ab] & [ac] & [ao] \\
[bb] & [bc] & [bo] \\
[bc] & [cc] & [co]
\end{vmatrix}}{\Delta}
+ [bo] \frac{
\begin{vmatrix}
[aa] & [ac] & [ao] \\
[ab] & [bc] & [bo] \\
[ac] & [cc] & [co]
\end{vmatrix}}{\Delta}
- [co] \frac{
\begin{vmatrix}
[aa] & [ab] & [ao] \\
[ab] & [bb] & [bo] \\
[ac] & [bc] & [co]
\end{vmatrix}}{\Delta}
\]

which reduces to Eq. 14 when it is observed that the coefficient 
of −[ao] is none other than A, the coefficient of [bo] is −B, and 
the coefficient of −[co] is C. 

(b) Prove that the extreme left entry in Row IV of the solu¬ 
tion exhibited on page 158 is actually S. Thus, the minimized 
sum of squares of the residuals comes automatically in the routine 
of the solution. 

(c) Show also, by noting how Rows 8, 9, and 10 are formed, that

$$
S = [oo] - [aa]A''^2 - [bb.1]B'^2 - [cc.2]C^2 \tag{16}
$$

where $A'' = [ao]/[aa]$ = the value of $A$ that would be obtained if $b$ and $c$ were fixed (not adjustable) at the values $b_0$ and $c_0$, and wherein also $B' = [bo.1]/[bb.1]$ = the value of $B$ that would be obtained if $c$ were fixed at the value $c_0$, but $a$ and $b$ both adjustable.

Remark 1. This result sheds a singular elegance on the form of the solution exhibited on page 158. The term [oo] seen in Row 4 is the sum of the weighted squares of the residuals calculated under the assumption that $a = a_0$, $b = b_0$, $c = c_0$ (see Exercise 3 of Sec. 58, p. 145). The three negative terms in the “1” column of Rows 8, 9, and 10 on page 158 are precisely the amounts subtracted from [oo] by the terms on the right of Eq. 16, and in the same order. That is to say, by the routine solution outlined on page 158 there will appear (1°) in Row 8 the reduction in weighted squares that is brought about by allowing $a$ to be adjustable while $b$ and $c$ are fixed at $b_0$ and $c_0$; (2°) in Row 9 the further reduction that is accomplished by allowing $b$ to be adjustable while $c$ is held at $c_0$; and (3°) in Row 10 the final reduction that comes from allowing $c$ to be adjustable, the net result being the minimized sum of weighted squares, $S$, in Row IV.

After a solution has been carried out upon the parameters $a$, $b$, $c$, the question often arises, what would have been the result for $S$ if the parameter $c$ had not been adjusted, but had been fixed at (say) $\gamma_0$? Now if this $\gamma_0$ is not too far from the final value of $c$, one need only add $[cc.2](c - \gamma_0)^2$ to $S$ in order to see what would have been obtained for $S$ had $c$ been fixed at $\gamma_0$ (see Examples 1 and 2 of Chapter XI). The value of $\sigma^2(ext)$ would then be $S + [cc.2](c - \gamma_0)^2$ divided by $n - 2$, not $n - 3$ ($n$ = the number of points).

Under certain conditions, the restriction that $\gamma_0$ and $c$ be not far apart can be removed; the polynomial $y = a + bx + cx^2$ with $x$ free of error is an example. It all depends on whether the parameter $c$ enters the $L$ factors of Tables 1 and 2 in Section 60. If it does not, then no matter how wide the disparity between $c$ and $\gamma_0$, the term $[cc.2](c - \gamma_0)^2$ still represents the increment in $S$ that would be brought about by adjusting $a$ and $b$ to the condition $c = \gamma_0$.
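For the parabola with x free of error, the increment described in this remark can be verified directly. The sketch below uses hypothetical observations and an arbitrary fixed value for the third parameter (called `gamma0` here); it fits the parabola, then refits only a and b with c held fixed, and compares the increase in the sum of squares with [cc.2] times the squared difference. The function names are illustrative.

```python
def fit_line(xs, ys):
    """Least squares line y = a + bx, x free of error, unit weights."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b


def fit_parabola(xs, ys):
    """Solve the normal equations for y = a + bx + cx^2 by Gauss's elimination.

    Returns a, b, c and the reduced coefficient [cc.2]. The approximate
    values are taken as zero, so the unknowns are the full parameters.
    """
    s = lambda p: sum(x ** p for x in xs)
    t = lambda p: sum(x ** p * y for x, y in zip(xs, ys))
    aa, ab, ac, bb, bc, cc = s(0), s(1), s(2), s(2), s(3), s(4)
    ao, bo, co = t(0), t(1), t(2)
    bb1, bc1, cc1 = bb - ab * ab / aa, bc - ab * ac / aa, cc - ac * ac / aa
    bo1, co1 = bo - ab * ao / aa, co - ac * ao / aa
    cc2, co2 = cc1 - bc1 * bc1 / bb1, co1 - bc1 * bo1 / bb1
    c = co2 / cc2
    b = (bo1 - bc1 * c) / bb1
    a = (ao - ab * b - ac * c) / aa
    return a, b, c, cc2


def sum_sq(xs, ys, a, b, c):
    """Sum of squared residuals of the parabola at the given parameters."""
    return sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]      # hypothetical abscissas
ys = [1.1, 1.9, 5.2, 9.8, 17.1]     # hypothetical observations
gamma0 = 1.0                        # hypothetical fixed value for c

a, b, c, cc2 = fit_parabola(xs, ys)
S_min = sum_sq(xs, ys, a, b, c)

# Hold c at gamma0 and adjust only a and b (a line fitted to y - gamma0 * x^2).
a_fix, b_fix = fit_line(xs, [y - gamma0 * x * x for x, y in zip(xs, ys)])
S_fixed = sum_sq(xs, ys, a_fix, b_fix, gamma0)

increment_predicted = cc2 * (c - gamma0) ** 2
```

Because c does not enter the L factors for this polynomial, the agreement between `S_fixed - S_min` and `increment_predicted` holds however far `gamma0` is placed from the fitted c.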

In like manner, and under similar restrictions, a term $(b - \beta_0)^2/c_{22}$ will represent the increment in $S$ that would be brought about by adjusting $a$ and $c$ to the condition $b = \beta_0$ (see Exercise 2d).

Similarly, the two terms $[bb.1](b - \beta_0)^2$ and $[cc.2](c - \gamma_0)^2$ added to the $S$ found in Row IV will give what would have been obtained for $S$ if only $a$ had been adjusted, $b$ and $c$ fixed at $\beta_0$ and $\gamma_0$. In this circumstance, $\sigma^2(ext)$ would be computed with $n - 1$ degrees of freedom.

It is important, as a practical matter, to note that the coefficients [bb.1] and [cc.2], needed for these increments, are already at hand, numerically, in Rows II and III in the finished solution, page 158.

Remark 2. In both parts (a) and (c), $S$ is shown as three terms subtracted from [oo]. Evidently

$$
[ao]A + [bo]B + [co]C = [aa]A''^2 + [bb.1]B'^2 + [cc.2]C^2
$$
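The equality of Eqs. 14 and 16 with the directly computed sum of squares can be illustrated numerically. The sketch below assumes a parabola with x free of error (so L = 1), hypothetical observations, and hypothetical approximate values `a0`, `b0`, `c0`; the derivatives and the sign convention (adjusted value = approximation minus residual) follow the text, while the helper name `bracket` is illustrative.

```python
def bracket(u, v):
    """Gauss bracket [uv]: sum of products over all points (unit weights)."""
    return sum(p * q for p, q in zip(u, v))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]      # hypothetical abscissas
ys = [1.1, 1.9, 5.2, 9.8, 17.1]     # hypothetical observations
a0, b0, c0 = 1.0, 0.5, 1.0          # hypothetical approximate values

# F = y - (a + bx + cx^2); derivatives with respect to a, b, c.
Fa = [-1.0] * len(xs)
Fb = [-x for x in xs]
Fc = [-x * x for x in xs]
F0 = [y - (a0 + b0 * x + c0 * x * x) for x, y in zip(xs, ys)]

aa, ab, ac, ao = bracket(Fa, Fa), bracket(Fa, Fb), bracket(Fa, Fc), bracket(Fa, F0)
bb, bc, bo = bracket(Fb, Fb), bracket(Fb, Fc), bracket(Fb, F0)
cc, co, oo = bracket(Fc, Fc), bracket(Fc, F0), bracket(F0, F0)

# Forward elimination (Gauss's reduced coefficients).
bb1, bc1, bo1 = bb - ab * ab / aa, bc - ab * ac / aa, bo - ab * ao / aa
cc1, co1 = cc - ac * ac / aa, co - ac * ao / aa
cc2, co2 = cc1 - bc1 * bc1 / bb1, co1 - bc1 * bo1 / bb1

# Back solution for the residuals A, B, C.
C = co2 / cc2
B = (bo1 - bc1 * C) / bb1
A = (ao - ab * B - ac * C) / aa

S_eq14 = oo - ao * A - bo * B - co * C
A2, B1 = ao / aa, bo1 / bb1          # A'' and B' of Eq. 16
S_eq16 = oo - aa * A2 ** 2 - bb1 * B1 ** 2 - cc2 * C ** 2

# Direct evaluation from the adjusted parameters.
a, b, c = a0 - A, b0 - B, c0 - C
S_direct = sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))
```

All three values of S should agree to within rounding, which is the content of Remark 2.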

(d) From Eq. 16 it follows that if $c$ be changed by the amount $\delta c$, while $a$ and $b$ remain fixed, the change $\delta S$ in the sum of squares obeys the relation

$$
\frac{\delta S}{\sigma^2} = \left( \frac{\delta c}{\text{S. E. of } c} \right)^2 \tag{17}
$$

Exercise 4. Prove that the solution for $A$, $B$, and $C$ found from the “1” column will also be given by the equations

$$
\begin{aligned}
A &= [ao]c_{11} + [bo]c_{12} + [co]c_{13} \\
B &= [ao]c_{21} + [bo]c_{22} + [co]c_{23} \\
C &= [ao]c_{31} + [bo]c_{32} + [co]c_{33}
\end{aligned} \tag{18}
$$

This method of finding the unknowns $A$, $B$, and $C$ is called the reciprocal solution because the reciprocal matrix is used as a multiplier along with the constant (“1”) column [ao], [bo], [co]. The reciprocal solution is particularly useful when the same coefficients, hence the same reciprocal matrix, are repeated over and over from one problem to another, but with a new constant column for each problem, and hence with a new set of values for $A$, $B$, and $C$ each time. See, however, the reference to difficulties that may be encountered in near indeterminacy, mentioned earlier in this section, also in Example 1 of Chapter XI. Theoretically, the direct and reciprocal solutions should agree, and they will if the computer carries enough decimals.

In matrix notation, the results of this exercise can be expressed as

$$
\Delta v = H
$$

where $\Delta$ is the matrix of the coefficients of the unknowns, $H$ is the matrix of the “1” column, namely,

$$
H = \begin{pmatrix} [ao] \\ [bo] \\ [co] \end{pmatrix}
$$

and $v$ is the matrix of the three unknowns, namely,

$$
v = \begin{pmatrix} A \\ B \\ C \end{pmatrix}
$$

The solution of the above equation is

$$
v = \Delta^{-1} H
$$

To evaluate $\Delta^{-1}$ we set

$$
\Delta c = 1 \quad \text{(the unit matrix)}
$$

and find

$$
c = \Delta^{-1}
$$

Having now the matrix $\Delta^{-1}$, we use it as a multiplier with $H$ to find the matrix $v$ from the relation above, getting $v = cH$. This is the matrix expression for the results stated in the preceding exercise. For illustrations of the reciprocal solution see Section 36 and Examples 1 and 2 of Chapter XI.
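The economy of the reciprocal solution, one inversion reused for several constant columns, can be sketched as follows. The coefficient matrix and the two constant columns are hypothetical (the columns were constructed so that the unknowns come out as simple numbers), and `invert` is an ordinary Gauss-Jordan routine standing in for the systematic solution of page 158, not a transcription of it.

```python
def invert(m):
    """Reciprocal of a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def multiply(c, h):
    """Matrix times column: the rows of Eq. 18."""
    return [sum(cij * hj for cij, hj in zip(row, h)) for row in c]


# Hypothetical coefficients of the unknowns (the same for every problem).
delta = [[5.0, 10.0, 30.0],
         [10.0, 30.0, 100.0],
         [30.0, 100.0, 354.0]]
c = invert(delta)                     # computed once

# Two hypothetical constant columns, one per problem.
H_first = [55.0, 170.0, 584.0]
H_second = [-20.0, -70.0, -254.0]

v_first = multiply(c, H_first)        # unknowns A, B, C for the first problem
v_second = multiply(c, H_second)      # unknowns for the second problem
```

Each new problem then costs only one matrix multiplication; the caution in the text about near indeterminacy applies to the inversion step.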

Exercise 5. Prove that the values of the determinants $\Delta$ and $\Delta^{-1}$ are reciprocals. ($\Delta$ and $\Delta^{-1}$ defined on page 162.)

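Exercise 6. (a) If $x = r \cos\theta$, $y = r \sin\theta$, the two Jacobians as matrices, namely,

$$
\begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\[2ex] \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix}
$$

are reciprocals of one another; i.e., their product gives the unit matrix

$$
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
$$

(b) Show that a similar relation exists in three dimensions.

In the derivatives of the first matrix, $\theta$ is constant while $x$, $y$, and $r$ vary, and again $r$ is constant while $x$, $y$, and $\theta$ vary. In the derivatives of the second matrix, $x$ is constant while $r$, $\theta$, and $y$ vary, and again $y$ is constant while $r$, $\theta$, and $x$ vary.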
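Part (a) can be checked at any point away from the origin. The following sketch evaluates the two Jacobian matrices analytically at a hypothetical point (r = 2.5, θ = 0.7) and multiplies them; the function names are illustrative.

```python
import math


def jacobian_xy_wrt_polar(r, theta):
    """Derivatives of x = r cos(theta), y = r sin(theta) with respect to r and theta."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def jacobian_polar_wrt_xy(x, y):
    """Derivatives of r = sqrt(x^2 + y^2), theta = atan2(y, x) with respect to x and y."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


r, theta = 2.5, 0.7                    # hypothetical point
x, y = r * math.cos(theta), r * math.sin(theta)
product = matmul2(jacobian_xy_wrt_polar(r, theta), jacobian_polar_wrt_xy(x, y))
```

The product is the unit matrix to within rounding, whichever order the two matrices are multiplied in.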

62. The weights of the parameters; their standard errors. The standard error of a function of the parameters. The standard error of a curve. It is a fact¹³ that the reciprocals of the weights of the parameters are found on the diagonal of $\Delta^{-1}$ (see Exercise 2 of the preceding section), i.e.,

$$
w_a = \frac{1}{c_{11}}, \quad w_b = \frac{1}{c_{22}}, \quad w_c = \frac{1}{c_{33}} \tag{19}
$$

where

$c_{11}$ = var. coeff. of $a$, $c_{22}$ = var. coeff. of $b$, $c_{33}$ = var. coeff. of $c$

Then, since weights are reciprocals of variance coefficients (p. 21),

$$
\sigma_a^2 = c_{11}\sigma^2, \quad \sigma_b^2 = c_{22}\sigma^2, \quad \sigma_c^2 = c_{33}\sigma^2 \tag{20}
$$

or

$$
\begin{aligned}
(\text{S. E. of } a)^2 &= c_{11}\sigma^2 \\
(\text{S. E. of } b)^2 &= c_{22}\sigma^2 \\
(\text{S. E. of } c)^2 &= c_{33}\sigma^2
\end{aligned} \tag{21}
$$


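Let $f$ be a function of the parameters. Then

$$
\begin{aligned}
\sigma_f^2 &= (f_a\sigma_a)^2 + 2(f_a f_b r_{ab}\sigma_a\sigma_b + f_a f_c r_{ac}\sigma_a\sigma_c) + (f_b\sigma_b)^2 + 2 f_b f_c r_{bc}\sigma_b\sigma_c + (f_c\sigma_c)^2 \quad \text{(By Eq. 7, p. 40)} \\
&= \sigma^2 \{ c_{11}f_a^2 + 2c_{12}f_a f_b + 2c_{13}f_a f_c + c_{22}f_b^2 + 2c_{23}f_b f_c + c_{33}f_c^2 \}
\end{aligned} \tag{22}
$$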

13 The theory of all this goes back to Gauss, Theoria Combinationis, Art. 21.
An excellent reference is Whittaker and Robinson’s Calculus of Observations, 
Arts. 121-123. 





As in Section 13 we write for the unbiased estimate of $\sigma^2$ by external consistency,

$$
\sigma^2(ext) = \frac{S}{k} \quad \text{where } k = n - p
$$

$n$ being the number of points and $p$ the number of adjustable parameters. When $\sigma$ is not known from any better source, this estimate may have to suffice, and $\sigma^2(ext)$ would replace $\sigma^2$ in Eqs. 21 and 22, giving respectively

$$
\begin{aligned}
(\text{Est'd S. E. of } a)^2 &= c_{11}\sigma^2(ext) \\
(\text{Est'd S. E. of } b)^2 &= c_{22}\sigma^2(ext) \\
(\text{Est'd S. E. of } c)^2 &= c_{33}\sigma^2(ext)
\end{aligned} \tag{23}
$$

and

$$
(\text{Est'd S. E. of } f)^2 = \sigma^2(ext) \{ c_{11}f_a^2 + 2c_{12}f_a f_b + 2c_{13}f_a f_c + c_{22}f_b^2 + 2c_{23}f_b f_c + c_{33}f_c^2 \} \tag{24}
$$

The student is urged to study Chapter V of R. A. Fisher’s Statistical Methods for Research Workers, wherein examples of the manipulation of the reciprocal matrix will be found.
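Equation 22 is a quadratic form in the derivatives of f, with the reciprocal matrix as its coefficients. The sketch below evaluates it both as a double sum and term by term as printed; the reciprocal matrix is a hypothetical one (that of a parabola fitted at x = 0, 1, 2, 3, 4 with unit weights), the function is taken to be the fitted ordinate at x = 2, and σ² is set to unity purely for illustration.

```python
def variance_of_function(c, grad, sigma2):
    """Eq. 22 as a double sum: sigma^2 times sum of c[i][j] * f_i * f_j."""
    n = len(grad)
    return sigma2 * sum(c[i][j] * grad[i] * grad[j]
                        for i in range(n) for j in range(n))


# Hypothetical reciprocal matrix (cofactors over the determinant 700).
c = [[620.0 / 700.0, -540.0 / 700.0, 100.0 / 700.0],
     [-540.0 / 700.0, 870.0 / 700.0, -200.0 / 700.0],
     [100.0 / 700.0, -200.0 / 700.0, 50.0 / 700.0]]

# f = a + bx + cx^2 evaluated at x = 2, so that f_a = 1, f_b = x, f_c = x^2.
x = 2.0
fa, fb, fc = 1.0, x, x * x
sigma2 = 1.0                           # assumed known variance of unit weight

var_f = variance_of_function(c, [fa, fb, fc], sigma2)

# The same quantity written out term by term, as in Eq. 22.
var_f_expanded = sigma2 * (c[0][0] * fa ** 2 + 2 * c[0][1] * fa * fb
                           + 2 * c[0][2] * fa * fc + c[1][1] * fb ** 2
                           + 2 * c[1][2] * fb * fc + c[2][2] * fc ** 2)
```

Replacing `sigma2` by the external estimate gives Eq. 24 with no other change.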

63. The error bands associated with a curve. Rejection of observations. When we write

$$
y = f(x; a, b, c) \tag{25}
$$

and ask for the standard error of $y$, we are merely asking for the standard error of a function of $a$, $b$, and $c$, but not of $x$; consequently, we can apply Eqs. 22 or 24 at once; $x$ enters merely as a constant.

The distinction between Eqs. 22 and 24 is that, in the former, $\sigma$ is supposed to be known or approximated closely enough, under conditions of randomness, as from previous experience, or from internal consistency (Sec. 13), or from any other source that does not depend on the way the particular points in question fit the curve. In Eq. 24, on the other hand, $\sigma$ is estimated from the fit of the points, as was explained in Section 13. Eq. 22, when applied at abscissa $x$ along the fitted curve, gives the standard error of the curve for that particular abscissa. If this calculation is made for several abscissas, one may plot points along the standard error band. Error bands are plotted to show one or more standard errors above and below the fitted curve. In Fig. 22 (p. 228) the band shown is ±1.96 standard errors. This width of band would, on the average, leave 5 percent of the points outside the band if the coordinates were distributed normally about their true values with standard errors $\sigma/\sqrt{w_x}$ for $x$ and $\sigma/\sqrt{w_y}$ for $y$. Sometimes the “probable error” band is drawn,¹⁴ or a multiple thereof.

When $\sigma$ is estimated from the fit of the points and used in Eq. 24, one obtains a confidence band for the curve. A confidence band is different in principle from an error band only in using $\sigma(ext)$ in place of a presumably better known value of $\sigma$. This, however, is often an important difference, because, although $\sigma$ itself is supposed to be constant, the external estimate of $\sigma$ will vary from one experiment to another, even in randomness, unless $n - p$ is as large as 20 or preferably 30. The width of the band may be adjusted to give various degrees of “confidence.” This is done by using an integral of Student’s distribution, which is easy to do by looking up the corresponding value of $t$ in Fisher’s tables¹⁵ for $n - p$ degrees of freedom. Thus, to compute a 95 percent confidence band, one would look up $t_{.95}$ in Fisher’s table, and then compute $|Y - y|$ for several values of $x$, using the equation

$$
t_{.95} = \frac{|Y - y|}{\text{Est'd S. E. of } y} \tag{26}
$$

14 A probable error, for normally distributed observations, is 0.67 times the standard error. For other distributions, some other factor is required, but calculations are ordinarily made with the factor 0.67 for which ⅔ is a close enough approximation. Birge uses the probable error band along with his curves. The following papers of his are recommended for their scientific insight, and for simple derivations of the standard error formulas: (a) Physical Review, vol. 40, 1932: pp. 207-261; (b) Amer. Physics Teacher, vol. 7, 1939: pp. 351-357.

15 V. A. Nekrassoff’s very handy nomograph may be used. It was published in Metron, vol. 8, No. 3, 1930, and is reproduced in W. A. Shewhart’s Economic Control of Quality (Van Nostrand, 1931), p. 490; see also Deming and Birge, Statistical Theory of Errors (The Graduate School, Department of Agriculture, Washington, 1938), p. 136.





The distance $|Y - y|$ laid off above and below the curve defines points on the confidence band, and the result will have the appearance of Fig. 20. (The capital $Y$ used here is not to be confused with the same letter used in Fig. 17 and elsewhere for an observed coordinate.)





Fig. 20. A fitted curve and the corresponding confidence band. An error 
band is laid off in like manner as a multiple of the standard error of the func¬ 
tion, and has a similar appearance. 

Remark 1. A convenient reference showing the application of Eq. 24 to curve fitting is a paper by Henry Schultz, J. Amer. Stat. Assoc., vol. 25, 1930: pp. 139-185. Schultz shows curves and confidence bands of width twice the standard error of the curve for several kinds of curves. It should be mentioned, as Schultz does, that all these things were well known to Gauss and others in his time, but that they did not take the trouble to write out the formulas explicitly and draw the graphs for all the things that interest us today.

Remark 2. It must be remembered that, even in a state of randomness, a new set of points (i.e., a new experiment) will give a new curve and a new set of parameters; hence, curve and error band will be shifted to a new position by a new experiment. Moreover, since the external estimate of $\sigma$ will fluctuate from one set of data to another, then not only will the curve and confidence band be shifted to a new position by a new set of data, but the width of the band itself will also be different. This is one of the reasons why a single experiment, without consideration of other knowledge, is not a basis for action, particularly if the consequences of the wrong action are hazardous (cf. Ch. I).

Confidence intervals for any other function of $a$, $b$, and $c$ are made up in like manner, and similar remarks apply.

The purpose in drawing an error band or confidence band is to 
invoke statistical aid in detecting spurious conditions in the data, 
or, more precisely, in the experimental conditions that gave rise to 
the data. A point that lies outside an error band of width two or 
three standard errors should be investigated; but it is to be dis¬ 
carded, and the curve refitted, only if investigation discloses 
anomalous experimental conditions at that point. Whether one 
uses a band of width two standard errors or three standard errors 
is a matter that can be decided only by personal preference and 
experience in a particular line of work. The wider the band, the 
fewer the points outside it, and on this criterion the less likely one 
is to look for experimental difficulties. On the other hand, if the band is too narrow, one will look for experimental difficulties too often; that is, he will be looking for trouble too often when there is no trouble.¹⁶ Many papers and chapters have been written on the statistical rejection of observations, but the best practice seems to be contained in the statements just given. In summary, a point is never to be excluded on statistical grounds alone.¹⁷

16 These thoughts follow the reasoning expounded by Shewhart in 1924 when he introduced the control chart. The student of modern statistical theory will recognize in them the arguments inherent in errors of the first and second kinds.

17 R. A. Fisher, “On the mathematical foundations of theoretical statistics,” Phil. Trans. Royal Soc., vol. 222A, 1922: p. 322 in particular.



Part E 

EXERCISES AND NOTES 


CHAPTER X 

EXERCISES ON FITTING VARIOUS FUNCTIONS 

64. Purpose of the chapter. The exercises and notes in this 
chapter will serve two purposes: first, to provide practice in 
forming the normal equations for various functions commonly 
met in practice; second, to provide a compendium of results,
handy for reference. Once these exercises are mastered, other 
functions that arise in practice should present little or no difficulty. 

A special note should be made concerning the fitting of poly¬ 
nomials such as 

$y = a + bx$, $y = a + bx + cx^2$, etc.

When x is free of error and uniformly spaced, certain short-cuts, 
eminently worth while learning if the problem is to occur fre¬ 
quently, are provided by the use of orthogonal functions. Since 
good references are accessible, the subject need not be treated 
here. The methods shown in the following exercises will work 
under very general conditions. But if a polynomial is to be 
fitted again and again when x is free of error and equally spaced, 
the reader is advised to learn the method of orthogonal func¬ 
tions. The theory is complicated, but the application is not. 

The following list of references will suffice for clear descriptions 
of several different procedures: 

1. R. A. Fisher, Statistical Methods for Research Workers (Oliver & Boyd); sections 28, 28.1, and 29.2 in the 6th and later editions. Fisher’s procedure and his description thereof have justifiably found great favor.

2. R. A. Fisher and Frank Yates, Statistical Tables for Biological, Agricultural, and Medical Research (Oliver & Boyd, 1938). An extension of these tables has recently been published by R. L. Anderson and E. E. Houseman, “Tables of orthogonal polynomial values extended to n = 104” (Ames, Research Bulletin 297, 1942).

3. Raymond T. Birge and John D. Shea, “ A rapid method 
of calculating the least squares solution of a polynomial of 
any degree” ( University of California Publications in Mathe¬ 
matics , vol. 2, No. 5, 1927; now unfortunately out of print). 
This procedure is rapid and possesses great merit. Up to a 
certain stage it seems to be equivalent to Harold T. Davis’ 
method, but beyond that stage the remaining work is simpler 
than Davis’, and requires fewer decimals. 

4. A. C. Aitken, Proc. Royal Soc. (Edinburgh), vol. 53, 1932- 
1933: pp. 54-78. 

5. Max Sasuly, Trend Analysis of Statistics (The Brook¬ 
ings Institution, Washington, 1934). 


65. The line 1 

In the exercises that follow, the symbols $[x]$, $[xx]$, $[xy]$, $[xF_0]$, and the like, refer to summations formed with the observed coordinates. Moreover, $\bar{x}$ and $\bar{y}$ refer to the mean values of the observed coordinates. The distinction made in Chapters IV, VII, and IX (capital letters for observed coordinates, and small letters for the adjusted coordinates) can now be dropped.

In this chapter and the next, it will be convenient to use capital letters to denote logarithms ($Y$ for $\log y$; etc.). When there seems to be special need of distinguishing observed from calculated coordinates, the subscript obs or calc will be affixed. In the numerical evaluation of the derivatives, and of $W$ or $L$ (Tables 1 and 2 of Sec. 60), if $x$ and $y$ are called for, their observed values are to be inserted, along with the approximate determinations $a_0$, $b_0$, $c_0$ for the parameters.

Exercise 1 . (a) Given the line 

y = a + bx 

to be fitted to n points, x free of error, all y coordinates of equal 
weight (unity). Here we take 

F = y - (a + bx) 

1 The line $y = bx$, forced to pass through the origin (i.e., with $a = 0$), was discussed to some extent in Section 15.



The derivatives are

$$
F_a = -1, \quad F_b = -x, \quad F_x = -b \ \text{(not needed here)}, \quad F_y = 1
$$

$$
L = 1 \quad \text{(Why? See Eq. 8, p. 134.)}
$$

With the approximate values $a_0$ and $b_0$ we compute

$$
F_0 = y_{obs} - (a_0 + b_0 x)
$$

at every point. Since $L = 1$ at every point, Tables 1 and 2 of Section 60 coalesce, and the normal equations are seen to be these:

Row    A      B       = 1          C1    C2    Sum
1      n      [x]     -[F_0]       1     0
2             [xx]    -[xF_0]      0     1           (Set 1, Exercise 1)
3                     [F_0F_0]     0     0

(b) The solution for $A$ and $B$, found by the routine of Section 61 or any other method of solution, is

$$
A = -\frac{[F_0] + [x]B}{n}
$$

$$
B = \frac{-[xF_0] + \bar{x}[F_0]}{n\mu_2}
$$

where

$$
n\mu_2 = [xx] - n\bar{x}^2
$$

$n\mu_2$ is the second moment of the $x$ coordinates about an axis parallel to $Oy$ and passing through the centroid $\bar{x}$, $\bar{y}$.

(c) The adjusted values of $a$ and $b$ turn out to be

$$
a = a_0 - A = \bar{y} - b\bar{x}
$$

$$
b = b_0 - B = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\mu_2} = \frac{[xy] - n\bar{x}\bar{y}}{n\mu_2}
$$

The fitted line therefore passes through the centroid $\bar{x}$, $\bar{y}$. But note that when there is error in both $x$ and $y$ coordinates at some or all of the observed points, the weights being such that $w_x/w_y$ is not constant throughout, the line does not pass through the centroid (see Remark 2 in Exercise 4).

(d) The solution just found for $a$ and $b$ is the same as would have been found from the normal equations shown below as Set 2, in which the unknowns are the full values of $a$ and $b$. In this problem it is therefore permissible to take $a_0$ and $b_0$ both as zero, whereupon $F_0$ is simply $y_{obs}$.

Row    a      b       = 1      C1    C2    Sum
1      n      [x]     [y]      1     0
2             [xx]    [xy]     0     1           (Set 2, Exercise 1)
3                     [yy]     0     0

Note that a calculation of $F_0$ is required at every point in forming the normal equations of Set 1, but not for the formation of Set 2, because in the latter, $F_0$ is the same as $y_{obs}$. However, in Set 1 it is only the residuals $A$ and $B$ that are to be solved for, the main part of the adjustment having already been allowed for in fixing the approximate values $a_0$ and $b_0$. It is different in Set 2; there the unknowns are the full values of $a$ and $b$, requiring the computer to carry more figures. These additional figures usually more than offset the time required for computing $F_0$. It therefore is usually advisable to find good approximations and use Set 1. The better the approximations, the fewer figures required. Birge and Shea make use of this principle in their method of fitting polynomials (mentioned in the preceding section).

(e) When the solution of either Set 1 or 2 is carried out according to the scheme of calculation exhibited on page 158, the extreme left entry in Row III will be the minimized sum of squares $S$, or $\sum (y_{obs} - y_{calc})^2$. The sums of squares removed from $[F_0F_0]$ in Set 1, and from $[yy]$ in Set 2, by the successive adjustments of $a$ and $b$, appear in the extreme left entries of Rows 5 and 6. Show that

$$
S = [yy] - n\bar{y}^2 - n\mu_2 b^2
$$

the last term being the sum of squares removed by allowing the line to have slope $b$ instead of slope 0; in other words, the sum of squares removed by regression. The two terms $n\bar{y}^2$ and $n\mu_2 b^2$ appear in Rows 5 and 6 of the solution of Set 2.



(f) If $V$ denotes $y_{obs} - y_{calc}$ at any point, the solution for $a$ and $b$ renders $\sum V = 0$. (But note that neither $\sum V$ nor $\sum wV$ is necessarily zero in least squares solutions; it only happens to be so here. In fact, in Sec. 15a we saw a simple example wherein neither $\sum V$ nor $\sum wV$ was zero. See Remark 4 in Exercise 4; see also Exercise 5.)
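The results of parts (c), (e), and (f) can be illustrated with a short calculation. The sketch below uses hypothetical observations, with x free of error and unit weights; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Line y = a + bx by the centroid formulas of part (c)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    n_mu2 = sum(x * x for x in xs) - n * xbar ** 2          # [xx] - n xbar^2
    b = (sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar) / n_mu2
    a = ybar - b * xbar
    return a, b, xbar, ybar, n_mu2


xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical abscissas
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical observations
n = len(xs)

a, b, xbar, ybar, n_mu2 = fit_line(xs, ys)

# Residuals V = y_obs - y_calc and the minimized sum of squares.
V = [y - (a + b * x) for x, y in zip(xs, ys)]
S_direct = sum(v * v for v in V)

# Part (e): S = [yy] - n ybar^2 - n mu_2 b^2.
S_formula = sum(y * y for y in ys) - n * ybar ** 2 - n_mu2 * b ** 2

sum_V = sum(V)                       # part (f): vanishes for this problem
```

The line also passes through the centroid, since a + b·x̄ reproduces ȳ by construction.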

Exercise 2. (a) The reciprocal matrix for the normal equations in the preceding exercise appears in the $C_1$ and $C_2$ columns of Rows 7 and 8 (these numbers refer to the solution of the equations solved according to the form on p. 158). It turns out to be

$$
\Delta^{-1} = \begin{pmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{n\mu_2} & -\dfrac{\bar{x}}{n\mu_2} \\[2ex] -\dfrac{\bar{x}}{n\mu_2} & \dfrac{1}{n\mu_2} \end{pmatrix}
$$

(b) From the upper left and lower right elements of this array we may say that

$$
\text{The weight of } a = \frac{n}{1 + \bar{x}^2/\mu_2}
$$

$$
\text{The weight of } b = n\mu_2
$$

Thus, if the experimental conditions were random, our confidence in $b$ would increase as the “spread” of the points increases. Is this reasonable? Why does the weight of $a$ depend on $\bar{x}$?

(c)

$$
(\text{S. E. of } a)^2 = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{n\mu_2} \right)
$$

$$
(\text{S. E. of } b)^2 = \frac{\sigma^2}{n\mu_2}
$$

Note that the weights and standard errors of $a$ and $b$ do not involve the $y$ coordinates of the points. Compare with part d of the next exercise. Note also that the denominator of the last fraction is equal to $\Delta/n$, since

$$
n^2\mu_2 = \begin{vmatrix} n & [x] \\ [x] & [xx] \end{vmatrix} = \Delta
$$

hence near indeterminacy (a small value of $\Delta$) is closely associated with large standard errors of $a$ and $b$, and a rapid “fanning out” of the standard error of $y_{calc}$ each side of $\bar{x}$, $\bar{y}$ (see the next part; also Exercise 2a of Sec. 61).

(d) From Eq. 8, page 40, and the reciprocal matrix of part (a) of this exercise, prove that the

$$
(\text{S. E. of } y_{calc})^2 = \frac{\sigma^2}{n} \left\{ 1 + \frac{(x - \bar{x})^2}{\mu_2} \right\}
$$

Thus the standard error of $y_{calc}$ is least at the center of gravity $(\bar{x}, \bar{y})$ of the points, and fans out each side of it. (See Sec. 63 and the reference to Henry Schultz; also Figs. 20 and 22.)

(e) The standard error of the calculated line of Exercise 1 at the center of gravity is $\sigma/\sqrt{n}$, as it would be for $n$ observations made on a single unknown. (Do this in two ways: 1° put $x = \bar{x}$ in part d; 2° put $\bar{x} = 0$ in part c for the standard error of $a$.)
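The fanning out of the standard error of the calculated line can be tabulated with a few lines of code. This is a sketch under assumed values: the abscissas and σ are hypothetical, and σ is treated as known (so the result is an error band in the sense of Sec. 63, not a confidence band). The function names are illustrative.

```python
import math


def se_ycalc(x, xs, sigma):
    """Standard error of the fitted line at abscissa x (part d)."""
    n = len(xs)
    xbar = sum(xs) / n
    mu2 = sum((xi - xbar) ** 2 for xi in xs) / n
    return sigma * math.sqrt((1.0 + (x - xbar) ** 2 / mu2) / n)


def se_ycalc_from_reciprocal(x, xs, sigma):
    """The same quantity from the reciprocal matrix of part (a), with f_a = 1, f_b = x."""
    n = len(xs)
    xbar = sum(xs) / n
    n_mu2 = sum((xi - xbar) ** 2 for xi in xs)
    c11 = 1.0 / n + xbar ** 2 / n_mu2
    c12 = -xbar / n_mu2
    c22 = 1.0 / n_mu2
    return sigma * math.sqrt(c11 + 2.0 * c12 * x + c22 * x * x)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical abscissas
sigma = 0.2                          # hypothetical known standard error of unit weight
xbar = sum(xs) / len(xs)

# Half-widths of a band of 1.96 standard errors at several abscissas.
band = [(x, 1.96 * se_ycalc(x, xs, sigma)) for x in (0.0, 1.5, 3.0, 4.5, 6.0)]
```

The half-width is smallest at the centroid, where it reduces to 1.96σ/√n, and it is symmetrical about x̄.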

Exercise 3. (a) Carry out the solution of the normal equations of Exercise 1a in symbols, following the outline given in Section 61, and show that the minimized value of $S$ or of $\sum (y_{obs} - y_{calc})^2$ comes in the “1” column of Row III (which will be the extreme left entry in III). The same is true if the approximations $a_0$ and $b_0$ are used, as is advised in Exercise 1d.

(b) Show that the minimized sum of squares in this problem can be written

$$
\sum res^2 = n(1 - r^2)s_y^2
$$

where

$$
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{n s_x s_y} = \text{the correlation coefficient}
$$

and

$$
n s_y^2 = [yy] - n\bar{y}^2, \qquad n s_x^2 = [xx] - n\bar{x}^2
$$

($s_x^2$ is here used in place of $\mu_2$ for consistency with $s_y$.)

(c) The estimate of $\sigma$ made from the fit of the line is

$$
\sigma^2(ext) = \frac{n(1 - r^2)s_y^2}{n - 2} \quad \text{(See Sec. 13)}
$$



(d) The

$$
(\text{Est'd S. E. of } a)^2 = \frac{1 - r^2}{n - 2}\, s_y^2 \left( 1 + \frac{\bar{x}^2}{s_x^2} \right)
$$

$$
(\text{Est'd S. E. of } b)^2 = \frac{1 - r^2}{n - 2} \cdot \frac{s_y^2}{s_x^2}
$$

Note that the estimated standard errors of $a$ and $b$ involve the $y$ coordinates; compare with part (c) of the preceding exercise.

(e) The

$$
(\text{Est'd S. E. of } y_{calc})^2 = \frac{1 - r^2}{n - 2}\, s_y^2 \left\{ 1 + \frac{(x - \bar{x})^2}{s_x^2} \right\}
$$
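The formulas of this exercise can be checked against a direct calculation. The sketch below assumes hypothetical observations (x free of error, unit weights); the slope is obtained from the familiar relation b = r·s_y/s_x, and the names are illustrative.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical abscissas
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical observations
n = len(xs)

xbar, ybar = sum(xs) / n, sum(ys) / n
sx2 = sum(x * x for x in xs) / n - xbar ** 2        # s_x^2
sy2 = sum(y * y for y in ys) / n - ybar ** 2        # s_y^2
r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n * math.sqrt(sx2 * sy2))

b = r * math.sqrt(sy2 / sx2)         # slope of the fitted line
a = ybar - b * xbar

# Part (b): minimized sum of squares, directly and from r.
S_direct = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
S_from_r = n * (1.0 - r * r) * sy2

# Part (c): external estimate of sigma^2.
sigma2_ext = S_from_r / (n - 2)

# Part (d): estimated standard errors (squared) of a and b.
se_a2 = (1.0 - r * r) / (n - 2) * sy2 * (1.0 + xbar ** 2 / sx2)
se_b2 = (1.0 - r * r) / (n - 2) * sy2 / sx2
```

The estimated standard errors agree with those obtained by inserting the external estimate into the formulas of Exercise 2c, with s_x² written for μ₂.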

Exercise 4. (a) If both $x$ and $y$ coordinates are subject to error with varying precisions at some or all of the $n$ points, one must perform the calculations called for in Tables 1 and 2 of Section 60. For the line

$$
y = a + bx
$$

one may take

$$
F = y - a - bx
$$

Some good approximate values $a_0$ and $b_0$ having been found, one can then calculate the numerical value of

$$
F_0 = y_{obs} - (a_0 + b_0 x_{obs})
$$

at each of the $n$ points. The derivatives of $F$ are

$$
F_x = -b, \quad F_y = 1, \quad F_a = -1, \quad F_b = -x
$$

whereupon

$$
L = \frac{b^2}{w_x} + \frac{1}{w_y}
$$

$L$ varies from point to point with $w_x$ or $w_y$.

The headings for Table 1 of Section 60 would be these:

h, or Point No. | w_x | w_y | L | √L | F_b = -x | F_0

(It is not necessary to tabulate $F_x$, $F_y$, and $F_a$ since they remain constant from point to point.)





The headings for Table 2 of Section 60 would be as shown below.

h, or Point No. | F_a/√L = -1/√L | F_b/√L = -x/√L | F_0/√L | Sum

It has already been remarked (Remark 3, Sec. 54) that there is some theoretical advantage in writing $W$ in place of $1/L$, though it is a fact that with machines having automatic division and two dials for quotients (one for the individual quotients needed for Table 2, and another for cumulating the quotients across the rows for the “Sum” column of Table 2) there may be a practical advantage in tabulating $\sqrt{L}$ rather than $\sqrt{W}$ in Table 1, and using divisions by $\sqrt{L}$, rather than multiplications by $\sqrt{W}$, to form Table 2.

Writing now

$$
\frac{1}{W} = \frac{b^2}{w_x} + \frac{1}{w_y} \quad \text{for } L
$$

we see that $W = w_y$ if $x$ is free of error or if $b = 0$ (see the next exercise), and $W = w_x/b^2$ if $y$ is free of error, but that both terms are required if $x$ and $y$ are both subject to varying errors, and if the line is inclined so that $b$ is not small (see Exercise 8b).

In terms of $W$ the headings of Table 2 might be these:

h, or Point No. | -√W | -√W·x | √W·F_0 | Sum

(b) The normal equations are formed in the usual way by summing squares and cross-products from Table 2. They can be symbolized as shown in Set 1.

Row    A      B        = 1            C1    C2    Sum
1      [W]    [Wx]     -[WF_0]        1     0
2             [Wxx]    -[WxF_0]       0     1           (Set 1, Exercise 4)
3                      [WF_0F_0]      0     0

The systematic solution of the normal equations (shown on p. 158) gives $A$ and $B$ from the “1” column, and the reciprocal matrix $\Delta^{-1}$ as usual from the $C_1$ and $C_2$ columns. The adjusted values of $a$ and $b$ will be

$$
a = a_0 - A, \qquad b = b_0 - B
$$


The systematic solution gives the minimized value of $\sum (w_x V_x^2 + w_y V_y^2)$ in Row III, column “1.” That portion of the sum of the weighted squares subtracted from $[WF_0F_0]$ by shifting the first parameter from $a_0$ to $a$ appears in Row 5, and the portion further removed by shifting the second parameter from $b_0$ to $b$ appears in Row 6 (see Exercise 3 of Sec. 61; also Exercise 1 of this section).

Note that $a_0$ and $b_0$ may be taken as 0 (with the necessary increase in the number of decimals required in the normal equations), so far as $F_0$ is concerned, in which event $F_0$ becomes simply

$$
F_0 = y_{obs}
$$


The normal equations are then symbolized like those following,

Row    a      b        = 1        C1    C2    Sum
1      [W]    [Wx]     +[Wy]      1     0
2             [Wxx]    +[Wxy]     0     1           (Set 2, Exercise 4)
3                      [Wyy]      0     0

and the solution gives $a$ and $b$ directly. Why are more decimals required in these equations than in the preceding ones giving the (supposedly small) residuals $A$ and $B$?

But note carefully that an approximate value of $b$ must be used in the calculation of $W$ at each point where $x$ is subject to error. $b_0$ may be called 0 in the calculation of $F_0$ (as noted above) but not in the calculation of $W$. If this admonition is disregarded, the effect of the weighting of $x$ is lost. In fact, if it turns out that the approximate value of $b$ used for calculating $W$ was too far removed from the final $b$, it may be desirable to make a second adjustment to secure improved weightings $W$, which can be obtained by using the value of $b$ from the first adjustment; but this is seldom found necessary in practice.²

2 As Gauss put it, in a somewhat different problem: “Quodsi dein calculo absoluto contra exspectationem valores incognitarum p', q', r', s', etc., tanti emergerent, ut parum tutum videatur, quadrata productaque neglexisse, eiusdem operationis repetitio (acceptis loco ipsarum π, χ, ρ, σ, etc., valoribus correctis ipsarum p, q, r, s, etc.) remedium promtum afferet.” Theoria Motus Corporum Coelestium (Hamburg, 1809), Art. 180.
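The procedure of this exercise, including the optional second adjustment of the weightings W, might be sketched as below. The data, the weights, and the starting slope are all hypothetical, and the fixed number of passes is an assumption made for the sake of a short example (the text notes that a second adjustment is seldom necessary). The approximations for F₀ are taken as zero, so each pass solves Set 2.

```python
def fit_line_both_errors(xs, ys, wx, wy, b_start, passes=5):
    """Line y = a + bx with both coordinates subject to error.

    At each pass W = 1 / (b^2 / w_x + 1 / w_y) is formed from the current
    slope, and Set 2 is solved for a and b. Returns a, b, and the W used
    in the final solution.
    """
    b = b_start
    for _ in range(passes):
        W = [1.0 / (b * b / u + 1.0 / v) for u, v in zip(wx, wy)]
        sW = sum(W)
        sWx = sum(w * x for w, x in zip(W, xs))
        sWxx = sum(w * x * x for w, x in zip(W, xs))
        sWy = sum(w * y for w, y in zip(W, ys))
        sWxy = sum(w * x * y for w, x, y in zip(W, xs, ys))
        det = sW * sWxx - sWx * sWx
        a = (sWy * sWxx - sWx * sWxy) / det
        b = (sW * sWxy - sWx * sWy) / det
    return a, b, W


xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical observed x
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical observed y
wx = [4.0, 1.0, 2.0, 1.0, 4.0]       # hypothetical weights of x
wy = [1.0, 2.0, 1.0, 3.0, 1.0]       # hypothetical weights of y

a, b, W = fit_line_both_errors(xs, ys, wx, wy, b_start=2.0)

# Quasi center formed with the weightings W of the final solution.
x_quasi = sum(w * x for w, x in zip(W, xs)) / sum(W)
y_quasi = sum(w * y for w, y in zip(W, ys)) / sum(W)

# With x free of error (infinite weight of x), W reduces to w_y.
inf = float("inf")
a_free, b_free, W_free = fit_line_both_errors(xs, ys, [inf] * 5, wy, b_start=2.0)
```

The fitted line passes through the quasi center formed with the final weightings, as the first normal equation of Set 2 requires; it does not in general pass through the ordinary center of gravity.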



Remark 1. Of course, it may happen in some particular problem that $b$ actually is very small, and that 0 is therefore a good approximation for $b$. In such circumstances the line is practically horizontal, the weighting of $x$ does not matter much, and the computer may as well simplify matters and set $W = w_y$, ignoring the weighting of $x$, not because $x$ is free of error (i.e., not because $w_x$ is infinite), but because $b$ is zero or nearly so. The student should ponder over the situation where $b$ is actually known to be 0; do the values of $x$ count at all in the solution? Does this not take us back to the simplest problem in curve fitting, seen in Section 10? The solution obtained there can be translated to the needs of the present circumstances by interchanging $x$ and $y$, and rewriting Eq. 10 on page 19 to get

$$
a = \frac{\sum wy}{\sum w}
$$

then rewriting Eqs. 12 and 12' on page 21 to get

$$
w_a = \sum w
$$

and the

$$
(\text{S. E. of } a)^2 = \frac{\sigma^2}{\sum w}
$$

In all these equations, $w$ now denotes the weight of $y$, not $x$.

Remark 2. When $x$ and $y$ are both subject to error at some or all of the observed points, the line does not pass through the center of gravity

$$
\bar{x} = \frac{[x w_x]}{[w_x]}, \qquad \bar{y} = \frac{[y w_y]}{[w_y]}
$$

But the line will in any case pass through a quasi center defined as

$$
x' = \frac{[xW]}{[W]}, \qquad y' = \frac{[yW]}{[W]}
$$

Remark 3. With 1/W written in place of L, Eq. 8, page 134,
gives

$$\frac{1}{W} = \frac{F_x F_x}{w_x} + \frac{F_y F_y}{w_y}$$

(Cf. Remark 3, p. 135.)

As has already been seen, the first term drops out if x is free of
error, and the second term drops out if y is free of error. To
make the change from a solution in which x is free of error to
one wherein both coordinates are subject to error, we merely
add the other term in 1/W and recalculate W at every point,
the procedure being otherwise the same. There is a close
analogy with celestial mechanics; when one wishes to compute
the orbit of a body of mass m about another of mass M,
he may at first make the simplifying assumption that M is
infinite (i.e., immovable), and solve the equations, later replacing
m by μ where

$$\frac{1}{\mu} = \frac{1}{m} + \frac{1}{M}$$

This replacement yields the absolute motion of the two bodies,
neither being of infinite mass (i.e., neither one immovable).
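For the straight line, where $F_x = -b$ and $F_y = 1$, the two terms of 1/W can be computed directly. The sketch below is an illustration only; the name `effective_weight` and the use of `math.inf` to mark an error-free coordinate are assumptions made for the example.

```python
import math

def effective_weight(b, w_x, w_y):
    """W for the line y = a + bx, from 1/W = b^2/w_x + 1/w_y.

    Pass math.inf as the weight of a coordinate that is free of error;
    the corresponding term of 1/W then drops out.
    """
    return 1.0 / (b * b / w_x + 1.0 / w_y)

print(effective_weight(2.0, math.inf, 3.0))  # x free of error: W reduces to w_y
print(effective_weight(2.0, 8.0, math.inf))  # y free of error: W reduces to w_x / b^2
print(effective_weight(0.0, 5.0, 3.0))       # horizontal line: W reduces to w_y
```

Moving from the simpler solution to the one with both coordinates in error changes nothing but the value of W supplied at each point.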

Remark 4. When x and y are both subject to error at some
or all of the points, we cannot always assert that

$$\sum V_x = 0, \quad \text{or} \quad \sum V_y = 0, \quad \text{or} \quad \sum (w_x V_x + w_y V_y) = 0$$

though these may sometimes happen, as in Exercises 1 and 5,
q.v. We have already seen a simple example in Section 15,
where these summations were not zero. There is, however, a
property of least squares by which one can always assert that
after the adjustment,³

$$\sum (w_x U_x V_x + w_y U_y V_y) = 0$$

(For definitions of $U_x$, etc., see Figs. 16 and 17 on pp. 132 and
133.)

Remark 5. It is interesting to note that in the routine solution
of Set 2, the minimized S appears in the extreme left entry
of Row III, but that, in contrast with Set 1, unless the final
value of b is actually 0 or very near 0, the entry in Row 6 directly
above S will not show the increment in S that would result from
fixing b at either the value 0 or $b_0$. The reason is that a good
value of b must be used in W at each point where x is subject to
error: if we want to know what the solution would have been
with b = 0, we must actually make a solution with b set equal
to 0 in the computation of W, in which circumstance W reduces
to $w_y$, as already noted.

³ Published by the author in the Phil. Mag., vol. 19, 1935: pp. 389-402.

Exercise 5. Given

y = a + bx

to be fitted to n points, x free of error, the y coordinates each having
weight $w_y$, varying from point to point. This is similar to
Exercise 1 except that now the y coordinates have unequal precisions.
Here we take

F = y − a − bx

as in the preceding exercise, the derivatives being also the same.
But since x is free of error, $w_x$ is infinite, and it follows that W
(defined as 1/L) is none other than $w_y$. All we have to do is
replace W in the preceding exercise by $w_y$, and the results will
apply here. The headings of Table 1 of Section 60 will be these:

h, or Point No.    $w_y$    $\sqrt{w_y}$    $F_b = -x$    $F_0$

For Table 2 they will be these:

h, or Point No.    $F_a/\sqrt{L} = -\sqrt{w_y}$    $F_b/\sqrt{L} = -x\sqrt{w_y}$    $F_0/\sqrt{L} = \sqrt{w_y}\,F_0$    Sum

The normal equations are written in the same symbols as those of
the preceding exercise, but with $w_y$ in place of W. Row III in the
systematic solution of the normal equations (p. 158) gives the
minimized value of $\sum w_y (y_{obs} - y_{calc})^2$. In Rows 5 and 6 are
found the portions of the weighted squares removed by a and b, as
in Exercise 1e (p. 175).

Note that, as in the preceding exercise, it is permissible to take
$a_0$ and $b_0$ as zeros, if the number of decimals in the normal equations
is increased accordingly. In this event,

$$F_0 = y_{obs}$$

and the normal equations may be written

Row    a       b        = 1      C₁    C₂    Sum
1      [w]     [wx]     [wy]     1     0
2              [wxx]    [wxy]    0     1
3                       [wyy]    0     0

giving a and b directly. Row III in the systematic solution will
give the minimized value of $\sum w_y (y_{obs} - y_{calc})^2$ in the "1"
column, and Rows 5 and 6 will show the sum of squares removed
successively by a and b, as in Exercise 1e. Since $w_x$ is infinite, the
question of an approximate value of b for use in the calculation of
W does not come up.

Remark. For the conditions stated (x free of error) the 
sum of the weighted y residuals is zero, i.e., 

$$\sum w_y V_y = 0$$

See Remark 4 of Exercise 4, page 182. 
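The normal equations of Exercise 5 with $a_0 = b_0 = 0$ can be solved in closed form for two unknowns. The sketch below is illustrative only: it solves the two equations by determinants rather than by the book's tabular routine, and the function name and sample data are assumptions. It also checks the property stated in the Remark, that the weighted y residuals sum to zero.

```python
def fit_weighted_line(x, y, w):
    """Solve [w]a + [wx]b = [wy], [wx]a + [wxx]b = [wxy] for a and b."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx
    a = (swy * swxx - swx * swxy) / det
    b = (sw * swxy - swx * swy) / det
    return a, b

# Hypothetical data: x free of error, y of unequal weight.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.9, 5.2, 7.1]
w = [1.0, 2.0, 1.0, 4.0]
a, b = fit_weighted_line(x, y, w)
weighted_residual_sum = sum(wi * (yi - a - b * xi) for wi, xi, yi in zip(w, x, y))
print(a, b, weighted_residual_sum)
```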

Exercise 6. (a) Given the line

y = a + bx

to be fitted to n points, both x and y coordinates subject to error
but in such a way that $w_x/w_y$ is constant and neither infinite nor zero,
the line passes through the center of gravity $\bar{x} = [w_x x]/[w_x]$,
$\bar{y} = [w_y y]/[w_y]$, with slope

$$b = \frac{c[wv^2] - [wu^2] + \sqrt{\{c[wv^2] - [wu^2]\}^2 + 4c[wuv]^2}}{2c[wuv]}$$

This is equivalent to a result obtained by Kummell in 1876, Karl
Pearson in 1901, and Gini in 1921. Here u and v are the x and y
coordinates of a point, measured from the center of gravity $\bar{x}, \bar{y}$; i.e.,
$u_i = x_i - \bar{x}$, and $v_i = y_i - \bar{y}$. c is written for $w_y/w_x$, and w in
place of $w_x$, for convenience.

(b) If the plus sign be changed to minus in front of the radical,
the result is the slope of the worst fitting line, that which maximizes
the value of $\sum (w_x V_x^2 + w_y V_y^2)$.

(c) Prove that under these conditions of weighting, the best and
worst fitting lines are perpendicular to each other.
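The slope formula of Exercise 6 is easy to try numerically. The following sketch is an illustration, not the book's tabular procedure; the function names and the sample data are assumptions. It evaluates both roots and the weighted sum of squares $S = \sum W(v - bu)^2$ with $1/W = b^2/w_x + 1/w_y$.

```python
import math

def centered_sums(x, y, w):
    """Return [wu^2], [wv^2], [wuv], with u, v measured from the center of gravity."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    suu = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    svv = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    suv = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return suu, svv, suv

def best_and_worst_slopes(x, y, w, c):
    """w is the weight of x at each point; c = w_y / w_x is constant."""
    suu, svv, suv = centered_sums(x, y, w)
    d = c * svv - suu
    root = math.sqrt(d * d + 4.0 * c * suv * suv)
    return (d + root) / (2.0 * c * suv), (d - root) / (2.0 * c * suv)

def weighted_sum_of_squares(b, x, y, w, c):
    """S for a line of slope b through the center of gravity (W = c w / (c b^2 + 1))."""
    suu, svv, suv = centered_sums(x, y, w)
    return c * (svv - 2.0 * b * suv + b * b * suu) / (c * b * b + 1.0)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
w = [1.0] * 5
c = 2.0
best, worst = best_and_worst_slopes(x, y, w, c)
print(best, worst, best * worst)
```

In this sketch the product of the two slopes comes out as −1/c, so the lines are perpendicular when c = 1, and for other c they become perpendicular once each axis is scaled by the square root of its weight.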

Exercise 7. (a) Given

y = a + bx

to be fitted to n points when y is free of error and all x coordinates
are of equal weight (unity), we may write

x = p + qy

and find the following normal equations for p and q. These are
like Set 2 of Exercise 1 with x and y interchanged.

Row    p     q       = 1     C₁    C₂    Sum
1      n     [y]     [x]     1     0
2            [yy]    [yx]    0     1
3                    [xx]    0     0

Row III in the solution of the normal equations gives $\sum res^2$
where now the deviations are measured parallel to the x axis.

(b) The reciprocal matrix $A^{-1}$ found in the $C_1$ and $C_2$ columns
of Rows 7 and 8 of the solution will be

$$A^{-1} = \begin{pmatrix} \dfrac{1}{n} + \dfrac{\bar{y}^2}{n s_y^2} & -\dfrac{\bar{y}}{n s_y^2} \\ -\dfrac{\bar{y}}{n s_y^2} & \dfrac{1}{n s_y^2} \end{pmatrix}$$

where $s_y^2$ has the same significance as in Exercise 3.

(c) $$(\text{The S. E. of } x_{calc})^2 = \frac{\sigma^2}{n}\left\{1 + \frac{(y - \bar{y})^2}{s_y^2}\right\}$$

(d) The normal equations of Exercise 7a give $\sum V_x = 0$. (See
the remarks in Exercises 1f, 4, and 5.)

Exercise 8. (a) Prove that with y free of error, and all x coordinates
of equal precision, the normal equations for a and b (or for
A and B) in Exercise 4 will give the same line as the normal equations
in Exercise 7a (i.e., will give p = −a/b and q = 1/b), except
for the effect of the neglect of the squares of the x residuals. The
solution of Exercise 7a is the more accurate in not throwing away
any higher powers of residuals. This may occasionally be important.
(See also Exercises 18 and 23.)

(b) Show that if x has the same weight (i.e., the same precision)
over all n points, and y likewise, x and y both subject to error, the
line that one gets by the exact solution given in Exercise 6a lies
between the two false lines that one gets by (i) throwing the
adjustment all on to y, using the equations of Exercise 1; and (ii)
throwing the adjustment all on to x, using the equations of Exercise
7a; but that these two false lines differ only in the effect of the
squares of the x residuals and of the y residuals, respectively.
(Hint: Both terms of 1/W in Exercise 4 are constant over all
points when $w_x$ and $w_y$ are constant; hence, so far as the values of a
and b or p and q are concerned, W can be put equal to unity at every
point in all three solutions: the correct solution and the two
false solutions. The normal equations of Exercise 4 will then give
identical results for all three. But the normal equations of Exercise
4 can be in error at most by the neglect of higher powers of the
residuals, hence the false solutions (i) and (ii) can differ from the
true solution only through the neglect of such terms. This means
that when x has the same weight over all points, and y likewise,
the false solutions will hardly be distinguishable from the true
solution if the residuals are all fairly small.⁴)

Remark 1. If W is constant from one point to another, it is
advisable for convenience of computation to choose the system
of weighting so that W = 1 at all points. This is only saying
that the arbitrary factor $\sigma^2$ in Eq. 13 of Section 11 is to be chosen
so that W = 1. Then S in the extreme left entry of Row III
in the solution of the normal equations comes out in the same
system, and $\sigma^2(\text{ext}) = S/(n - 2)$ is the external estimate of $\sigma^2$
in the same units as were arbitrarily chosen for it.

(c) All three lines of part (b) pass through the center of gravity
$\bar{x}, \bar{y}$ (called also the centroid).

Remark 2. Statements similar to those of part (b) will hold
for any curve when the combination of the form of the function
and the weighting of the coordinates causes both terms of 1/W
to be constant over all n points. Example 3 of the next chapter
is an illustration in three dimensions (three terms in 1/W).

⁴ This fact was noted by the author without proof in the Proc. Physical Soc.
(London), vol. 47, 1935: p. 107.

Exercise 9. For the line y = a + bx fitted to n points, the
following expressions hold (all due to Karl Pearson, Phil. Mag.,
vol. 2, 1901: pp. 559-572).

(a) $$\sum res^2 = n(1 - r^2)s_y^2$$

x free of error, the y coordinates all of equal weight (unity);
the deviations measured in the vertical. (This result was given
in Exercise 3b.)

(b) $$\sum res^2 = n(1 - r^2)s_x^2$$

y free of error, the x coordinates all of equal weight (unity); the
deviations measured in the horizontal.

(c) $$\sum res^2 = \tfrac{1}{2}n\left\{s_x^2 + s_y^2 - \sqrt{(s_x^2 - s_y^2)^2 + 4r^2 s_x^2 s_y^2}\right\}$$

The x and y coordinates of equal weight (unity), the deviations
measured perpendicular to the fitted line.

In these formulas, $s_x^2$, $s_y^2$, and r have the meaning ascribed to them
in Exercise 3, page 177.
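The three expressions of Exercise 9 can be checked against residual sums computed directly from the fitted lines. The sketch below is only an illustration: it assumes that $s_x^2$ and $s_y^2$ are mean squared deviations (divisor n) and r the ordinary correlation coefficient, and the function name and data are hypothetical.

```python
import math

def pearson_residual_sums(x, y):
    """Return three (direct, formula) pairs for parts (a), (b), (c)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    u = [xi - xbar for xi in x]
    v = [yi - ybar for yi in y]
    suu = sum(ui * ui for ui in u)
    svv = sum(vi * vi for vi in v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    sx2, sy2 = suu / n, svv / n            # assumed definition: divisor n
    r = suv / math.sqrt(suu * svv)

    b_vert = suv / suu                     # (a) deviations measured in the vertical
    direct_a = sum((vi - b_vert * ui) ** 2 for ui, vi in zip(u, v))
    q_horiz = suv / svv                    # (b) deviations measured in the horizontal
    direct_b = sum((ui - q_horiz * vi) ** 2 for ui, vi in zip(u, v))
    d = svv - suu                          # (c) Exercise 6a with c = 1, w = 1
    b_perp = (d + math.sqrt(d * d + 4.0 * suv * suv)) / (2.0 * suv)
    direct_c = sum((vi - b_perp * ui) ** 2 for ui, vi in zip(u, v)) / (1.0 + b_perp ** 2)

    formula_a = n * (1.0 - r * r) * sy2
    formula_b = n * (1.0 - r * r) * sx2
    formula_c = 0.5 * n * (sx2 + sy2 - math.sqrt((sx2 - sy2) ** 2 + 4.0 * r * r * sx2 * sy2))
    return (direct_a, formula_a), (direct_b, formula_b), (direct_c, formula_c)

# Hypothetical data, for illustration only.
pairs = pearson_residual_sums([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.1])
for direct, formula in pairs:
    print(direct, formula)
```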


66. The parabola 

Exercise 10. Given

$$y = a + bx + cx^2$$

to be fitted to n points, x and y having weights $w_x$ and $w_y$ at any
point. Here we take

$$F = y - (a + bx + cx^2)$$

$$F_0 = y_{obs} - (a_0 + b_0 x_{obs} + c_0 x_{obs}^2)$$

The derivatives of F are

$$F_x = -(b + 2cx), \qquad F_y = 1$$

$$F_a = -1, \qquad F_b = -x, \qquad F_c = -x^2$$

$$L \text{ or } \frac{1}{W} = \frac{(b + 2cx)^2}{w_x} + \frac{1}{w_y}$$

The headings of Table 1 in Section 60 will be these:

h, or Point No.    $F_x = -(b + 2cx)$    $w_x$    $w_y$    $L$    $\sqrt{L}$    $F_b$    $F_c$    $F_0$

It is understood that in calculating all quantities under these
headings, x and y are to be replaced by their observed values, and
a, b, c, by $a_0$, $b_0$, $c_0$ (cf. the note at the beginning of Sec. 65, p. 173).
It is not necessary to tabulate $F_y$ and $F_a$ because they remain constant
from point to point.





The headings of Table 2 will be as shown below.

Point No.    $F_a/\sqrt{L} = -\sqrt{W}$    $F_b/\sqrt{L} = -\sqrt{W}\,x$    $F_c/\sqrt{L} = -\sqrt{W}\,x^2$    $F_0/\sqrt{L} = \sqrt{W}\,F_0$    Sum

The usual process of cumulating sums of squares and cross-products
in Table 2 yields the following normal equations.

Row    A      B        C        = 1         C₁   C₂   C₃   Sum
1      [W]    [Wx]     [Wx²]    −[WF₀]      1    0    0    ...
2             [Wx²]    [Wx³]    −[WxF₀]     0    1    0    ...     (Set 1,
3                      [Wx⁴]    −[Wx²F₀]    0    0    1    ...     Exercise 10)
4                               [WF₀F₀]     0    0    0    ...

The solution, carried out by the usual routine procedure, gives
A, B, and C, whence the adjusted values of a, b, and c are

$$a = a_0 - A$$
$$b = b_0 - B$$
$$c = c_0 - C$$

The minimized value of S or $\sum (w_x V_x^2 + w_y V_y^2)$ will appear in
Row IV, column "1." This will be simply $\sum w_x V_x^2$ if y is free of
error, and $\sum w_y V_y^2$ if x is free of error. Directly above, in Row 8,
appears the sum of squares that is removed from $[WF_0F_0]$ by shifting
the y intercept from $a_0$ to a; in Row 9 appears the further
decrease brought about by allowing the second parameter to shift
from $b_0$ to b; and in Row 10, just above S, appears the portion of
the sum of squares that is finally removed by adjusting the parabolic
term from $c_0x^2$ to $cx^2$ (see Exercise 3 of Sec. 61).

The reciprocal matrix $A^{-1}$ will appear in the $C_1$, $C_2$, $C_3$ columns
of Rows 11, 12, and 13 (the "back solution"), containing the
variance and product-variance coefficients for a, b, and c. (See
Exercise 12 for the matrix $A^{-1}$ in a special case.)

Note the similarity between Set 1 of Exercise 4 (p. 179), and
Set 1 of Exercise 10. Note also that if x is free of error, $a_0$, $b_0$, and
$c_0$ may be taken as 0, so far as $F_0$ is concerned, in which event $F_0$
becomes simply

$$F_0 = y_{obs}$$










The normal equations then give a, b, and c directly, and would
appear as shown below.

Row    a      b        c        = 1       C₁   C₂   C₃   Sum
1      [W]    [Wx]     [Wx²]    [Wy]      1    0    0
2             [Wx²]    [Wx³]    [Wxy]     0    1    0          (Set 2,
3                      [Wx⁴]    [Wx²y]    0    0    1          Exercise 10)
4                               [Wyy]     0    0    0

More decimals will be required here than if good values of $a_0$,
$b_0$, and $c_0$ had been used in the calculation of $F_0$, and the previous
normal equations (Set 1 of this exercise) had been used to find A,
B, and C. Why?

The reciprocal matrix is the same in both sets, and the minimized
value of $\sum (w_x V_x^2 + w_y V_y^2)$ again comes in the extreme
left entry of Row IV; but, as in Remark 5 of Exercise 4,
the entries directly above it in Rows 10 and 9 do not show the
increments in the sum of the weighted squares that would result
from setting c = 0 and b = c = 0, respectively, unless b and c
are very small, or x free of error.

Note the similarity between Set 2 of Exercise 4, and Set 2 of
Exercise 10. The remarks at the end of Exercise 4 apply here
with obvious modifications. For example, approximate values
of b and c must be used in the calculation of W at each point
where x is subject to error.
Exercise 11. Given⁵

$$y = a + bx + cx^2$$

to be fitted to n points, x free of error, all y coordinates of equal
weight (unity). If $a_0$, $b_0$, and $c_0$ all be taken as 0, the normal
equations giving a, b, and c directly are

Row    a     b       c       = 1      C₁   C₂   C₃   Sum
1      n     [x]     [x²]    [y]      1    0    0
2            [x²]    [x³]    [xy]     0    1    0
3                    [x⁴]    [x²y]    0    0    1
4                            [yy]     0    0    0

⁵ See the reduced type at the beginning of this chapter for references to
special methods involving orthogonal polynomials, applying to problems
wherein x is equally spaced.





These arise immediately from the last set of normal equations of the
preceding exercise by noting that under the conditions $w_x = \infty$
and $w_y = 1$ throughout, W = 1 throughout. Row IV in the solution
of the normal equations will contain the minimized value of
$\sum (y_{obs} - y_{calc})^2$ in the "1" column. The sum of squares successively
removed by the constant, linear, and parabolic terms will
appear in the "1" column of Rows 8, 9, and 10 (see Exercise 3 of
Sec. 61; also Exercises 1, 4, 5, and 10 of this chapter).

Note the similarity between these normal equations and those
of Exercise 1, Set 2 (p. 175).
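The normal equations of Exercises 10 (Set 2) and 11 can be formed and solved mechanically. The sketch below is an illustration only: it uses ordinary Gaussian elimination in place of the book's systematic tabular solution, and the function names and sample data are assumptions. With all weights equal to 1 it reproduces the case of Exercise 11.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda k: abs(aug[k][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for k in range(i + 1, n):
            factor = aug[k][i] / aug[i][i]
            for j in range(i, n + 1):
                aug[k][j] -= factor * aug[i][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol

def fit_parabola(x, y, w=None):
    """Fit y = a + bx + cx^2 with x free of error; W is the weight of y."""
    if w is None:
        w = [1.0] * len(x)              # Exercise 11: unit weights

    def s(p, q=0):                      # the bracket sum [W x^p y^q]
        return sum(wi * xi ** p * yi ** q for wi, xi, yi in zip(w, x, y))

    normal = [[s(0), s(1), s(2)],
              [s(1), s(2), s(3)],
              [s(2), s(3), s(4)]]
    rhs = [s(0, 1), s(1, 1), s(2, 1)]
    a, b, c = solve_linear(normal, rhs)
    # Minimized sum of squares: [Wyy] less the parts removed by a, b, c.
    minimized = s(0, 2) - (a * rhs[0] + b * rhs[1] + c * rhs[2])
    return a, b, c, minimized

# Hypothetical exact data from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 2.0 * xi + 3.0 * xi * xi for xi in xs]
print(fit_parabola(xs, ys))
```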


Exercise 12. (a) In the preceding exercise, let the origin of x be
taken at the mean value of x, and let $n\mu_2$ be written for $[x^2]$ and
$n\mu_4$ for $[x^4]$. Then the normal equations are

Row    a     b       c       = 1      C₁   C₂   C₃
1      n     0       nμ₂     [y]      1    0    0
2            nμ₂     0       [xy]     0    1    0
3                    nμ₄     [x²y]    0    0    1
4                            [yy]     0    0    0

Show that the reciprocal matrix is

$$A^{-1} = \begin{pmatrix} \dfrac{\mu_4}{n\mu_4 - n\mu_2^2} & 0 & -\dfrac{\mu_2}{n\mu_4 - n\mu_2^2} \\ 0 & \dfrac{1}{n\mu_2} & 0 \\ -\dfrac{\mu_2}{n\mu_4 - n\mu_2^2} & 0 & \dfrac{1}{n\mu_4 - n\mu_2^2} \end{pmatrix}$$

From this matrix, one can write down the standard error of the
fitted curve, or of any function of a, b, c, in terms of σ (see Sec. 62;
also Exercise 2 of this section). In particular, the

$$(\text{S. E. of } a)^2 = \frac{\sigma^2}{n(1 - \mu_2^2/\mu_4)}$$

$$(\text{S. E. of } b)^2 = \frac{\sigma^2}{n\mu_2}$$

(Compare with Exercise 2, p. 176.)

The (S. E. of $y_{calc}$)² at the center of gravity is equal to
$\sigma^2/n(1 - \mu_2^2/\mu_4)$, which exceeds the value $\sigma^2/n$ found in Exercise 2
for the line.

(b) Show that the determinant Δ of the coefficients is equal to
$n^3\mu_2(\mu_4 - \mu_2^2)$; hence that near indeterminacy (small Δ) will
result not only in instability but also in high standard errors for a,
b, and c, and rapid fanning out of the standard error of $y_{calc}$ (see
Exercise 2c, p. 176; also Exercise 2a of Sec. 61, p. 162).
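The reciprocal matrix and determinant of Exercise 12 can be verified numerically by multiplying the two matrices together. The sketch below is illustrative; the function names are assumptions, and the sample x values are chosen symmetric about their mean so that $[x^3]$ vanishes, as the zero entry in Row 2 of the normal equations presupposes.

```python
def exercise_12_matrices(x):
    """Return A, the claimed reciprocal matrix, and the claimed determinant."""
    n = len(x)
    mean = sum(x) / n
    u = [xi - mean for xi in x]                  # origin moved to the mean of x
    mu2 = sum(ui ** 2 for ui in u) / n
    mu4 = sum(ui ** 4 for ui in u) / n
    a_mat = [[n, 0.0, n * mu2],
             [0.0, n * mu2, 0.0],
             [n * mu2, 0.0, n * mu4]]
    d = n * mu4 - n * mu2 ** 2
    a_inv = [[mu4 / d, 0.0, -mu2 / d],
             [0.0, 1.0 / (n * mu2), 0.0],
             [-mu2 / d, 0.0, 1.0 / d]]
    det = n ** 3 * mu2 * (mu4 - mu2 ** 2)
    return a_mat, a_inv, det

def matmul3(p, q):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Hypothetical, equally spaced x values.
a_mat, a_inv, det = exercise_12_matrices([-2.0, -1.0, 0.0, 1.0, 2.0])
product = matmul3(a_mat, a_inv)
for row in product:
    print(row)
print(det)
```

The product should come out as the unit matrix, and the determinant shrinks toward zero as $\mu_4$ approaches $\mu_2^2$, which is the near indeterminacy mentioned in part (b).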

67. The exponential and its logarithmic form 

Exercise 13. Given the equation

$$y = ae^{bx}$$

to be fitted to n points, x free of error, all y coordinates of equal
precision (unit weight). Here we take

$$F = y - ae^{bx}$$

Good approximate values of a and b can usually be found by
plotting log y against x. Assuming that they can be obtained, we
write

$$F_0 = Y - a_0 e^{b_0 X}$$

(Y denotes an observed y as in Fig. 17, p. 133.)
The derivatives of F are

$$F_y = 1, \qquad F_a = -\frac{y}{a}, \qquad F_b = -xy$$

W = 1 at all points; hence Tables 1 and 2 of Section 60 coalesce.
They will be made up as follows. For convenience in writing, the
subscript 0 will be withheld from the a and b. $X_1$, $Y_1$, etc., are
the observed x and y coordinates of the n points.

Tables 1 and 2

h       F_a        F_b         F_0                Sum
1       −Y₁/a      −X₁Y₁       Y₁ − ae^{bX₁}
2       −Y₂/a      −X₂Y₂       Y₂ − ae^{bX₂}
etc.


The normal equations are formed from this table by summing
squares and cross-products:

Row    A           B          = 1         C₁    C₂    Sum
1      [Y²/a²]     [XY²/a]    −[YF₀/a]    1     0
2                  [X²Y²]     −[XYF₀]     0     1
3                             [F₀F₀]      0     0

The solution of the normal equations by the usual routine
described in Section 61 gives A and B, also the reciprocal matrix,
and the minimized value of $\sum res^2$. The adjusted values of a and
b are then

$$a = a_0 - A$$
$$b = b_0 - B$$

The extreme left entry in Row III of the solution of the normal
equations gives S or $\sum res^2$, the residuals all being measured
entirely in the vertical (i.e., parallel to Oy). Directly above S,
in Row 5, appears the sum of squares removed by the shift from
$a_0$ to a, and in Row 6 the further decrease brought about by adjusting
the exponent from $b_0x$ to $bx$.
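One adjustment of Exercise 13 can be sketched as follows. This is an illustration, not the book's tabular routine: the two normal equations are solved by determinants, the function names and data are assumptions, and repeating the adjustment until the corrections A and B become negligible is an added convenience in the spirit of the earlier remark that a second adjustment may occasionally be desirable.

```python
import math

def adjust_once(x, y, a0, b0):
    """One least squares adjustment of y = a e^(bx); returns the new (a, b)."""
    fa = [-yi / a0 for yi in y]                      # F_a = -Y/a
    fb = [-xi * yi for xi, yi in zip(x, y)]          # F_b = -XY
    f0 = [yi - a0 * math.exp(b0 * xi) for xi, yi in zip(x, y)]
    saa = sum(p * p for p in fa)                     # [Y^2/a^2]
    sab = sum(p * q for p, q in zip(fa, fb))         # [XY^2/a]
    sbb = sum(q * q for q in fb)                     # [X^2 Y^2]
    sa0 = sum(p * r for p, r in zip(fa, f0))         # -[Y F0 / a]
    sb0 = sum(q * r for q, r in zip(fb, f0))         # -[X Y F0]
    det = saa * sbb - sab * sab
    A = (sa0 * sbb - sab * sb0) / det
    B = (saa * sb0 - sab * sa0) / det
    return a0 - A, b0 - B                            # a = a0 - A, b = b0 - B

def fit_exponential(x, y, a0, b0, rounds=20):
    """Repeat the adjustment a fixed number of times (an assumption of this sketch)."""
    a, b = a0, b0
    for _ in range(rounds):
        a, b = adjust_once(x, y, a, b)
    return a, b

# Hypothetical exact data from y = 2 e^(0.5 x), started from rough values.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * xi) for xi in xs]
a_fit, b_fit = fit_exponential(xs, ys, a0=1.8, b0=0.55)
print(a_fit, b_fit)
```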

Exercise 14. If in the preceding exercise, the x coordinates are
free of error but the y coordinates have unequal precision, designated
by weight w (varying from point to point), W is no longer
unity, but is equal to w, which may vary from point to point.
Table 2 of Section 60 then runs as follows:

h       F_a/√L          F_b/√L          √w·F_0                  Sum
1       −√w₁ Y₁/a       −X₁Y₁√w₁        √w₁ (Y₁ − ae^{bX₁})
2       −√w₂ Y₂/a       −X₂Y₂√w₂        √w₂ (Y₂ − ae^{bX₂})
etc.

The approximate values $a_0$ and $b_0$ are inserted for a and b.





The normal equations, formed from Table 2 in the usual manner,
can be symbolized as follows:

Row    A            B           = 1          C₁    C₂    Sum
1      [wY²/a²]     [wXY²/a]    −[wYF₀/a]    1     0
2                   [wX²Y²]     −[wXYF₀]     0     1
3                               [wF₀F₀]      0     0

In the solution of the normal equations by the routine of page
158, the minimized value of $\sum w(y_{obs} - y_{calc})^2$ comes in the
extreme left entry of Row III. Just above, in Rows 5 and 6, will
appear the portions of the weighted sums of squares removed
successively by adjusting a and then b (see Exercise 3 of Sec. 61;
also Exercises 1, 4, 5, 10, 11, 12, and 13 of this chapter).

The adjusted values of a and b are, as usual,

$$a = a_0 - A$$

$$b = b_0 - B$$

A and B being found by solving the normal equations. Naturally
these normal equations become the same as those of Exercise 13
if w = 1 throughout.

Exercise 15. (a) The formula to be fitted in the two previous
exercises can be taken in the logarithmic form

$$\log y = \log a + bx \log e$$

Suppose now that y' be written for log y, a' for log a, b' for
b log e; then

$$y' = a' + b'x$$

We now take

$$f = y' - (a' + b'x)$$

$$f_0 = Y' - (a_0' + b_0'X) \qquad (Y' = \log Y_{obs})$$

wherein $a_0'$ means $\log a_0$, and $b_0'$ means $b_0 \log e$. The derivatives
of f are as follows:⁶

$$f_x = -b', \qquad f_{y'} = 1, \qquad f_y = \frac{dy'}{dy} = \frac{0.434}{y}$$

$$L \text{ or } \frac{1}{W} = \frac{b'^2}{w_x} + \frac{1}{w_{y'}} = \frac{b'^2}{w_x} + \frac{0.434^2}{y^2 w_y}$$

Suppose that x is free of error, but that the weight of an observed
y coordinate is w, which may vary from point to point. (If all y
coordinates have equal weight, it is easy to put w equal to 1 in
what follows.) With $w_x = \infty$, the first term in L drops out,
leaving

$$L \text{ or } \frac{1}{W} = \frac{0.434^2}{y^2 w}$$

It will be noticed from the result of Exercise 8e, page 45, that if
w is the weight of y, then $y^2 w/0.434^2$ is the weight of y' or log y.
Suppose that on this account we set

$$w' = \frac{y^2 w}{0.434^2} = (2.30\,y)^2 w$$

Then w' is the weight of y', and

$$W = w'$$

If w (the weight of y) is constant throughout, then $w_{y'}$ is not, and
vice versa. (Cf. the remark appended to Exercise 18.)

Table 2 of Section 60 will have headings as follows:

h, or Point No.    $f_{a'}/\sqrt{L} = -\sqrt{w'}$    $f_{b'}/\sqrt{L} = -x\sqrt{w'}$    $\sqrt{w'}\,f_0$    Sum

The normal equations will be

Row    A'      B'        = 1          C₁    C₂    Sum
1      [w']    [w'X]     −[w'f₀]      1     0
2              [w'X²]    −[w'Xf₀]     0     1
3                        [w'f₀f₀]     0     0

⁶ It is convenient to remember that log e = 1/ln 10 = 0.434... =
1/2.30.... The symbol log means base 10, and the symbol ln means base e
(logarithme naturel).









We perceive that these are similar to the normal equations of
Exercise 5, but with w' in place of w, and f in place of F. More
precisely, the comparison is this:

In Exercise 5, y = a + bx, x free of error, y of weight w.

Here, y' = a' + b'x, x free of error, y' of weight w'.

This means that we may fit the equation $y = ae^{bx}$ by writing it
in the logarithmic form

$$\log y = \log a + bx \log e$$

and treating it as a linear equation in log y and x, at the same time
giving ln y a weight just $y^2$ times the weight of y, or log y a weight
$(2.30\,y)^2$ times the weight of y.

Remark. It is customary among computers to fit the exponential
equation $y = ae^{bx}$ by taking logarithms and treating
it as linear in log y and x, but it is not so usual for them to
change the weighting to correspond to the logarithms. The
neglect of the factor $(2.30\,y)^2$ not only distorts the results for a
and b, but also invalidates the reciprocal matrix and all calculations
made with it on the standard error of a function of the
parameters; moreover, under such circumstances, the extreme
left entry of Row III no longer contains S. See also the reduced
type at the conclusion of Exercise 18, page 201.

(b) The extreme left entry of Row III in the solution of the normal
equations contains $\sum w(y_{obs} - y_{calc})^2$. The extreme left
numbers appearing in Rows 5 and 6 are the weighted sums of
squares removed successively by adjusting a and then b (see
Exercise 3 of Sec. 61, and Exercises 1, 4, 5, 10-14 of this chapter).
These statements would not be true if one were to neglect the
factor $(2.30\,y)^2$ for the weight of log y.

Note that in the normal equations of part (a) it is permissible to
use $a_0' = 0$ and $b_0' = 0$, in which event

$$f_0 = Y' \text{ or } \log y_{obs}$$

whereupon the normal equations will be as written below.

Row    a'      b'        = 1        C₁    C₂    Sum
1      [w']    [w'X]     [w'Y']     1     0
2              [w'X²]    [w'XY']    0     1
3                        [w'Y'Y']   0     0

($w' = 2.30^2 y^2 w$, as on the preceding page.)






These normal equations give a' and b' directly. As in Exercises
1 and 5, no question of an approximate value of b' enters for
the calculation of the w', since $w_x$ is infinite (x free of error), but
more decimals are required than when good approximate values of
a' and b' are used. (See Exercise 1d.)

In the solution of these normal equations by the routine exhibited
in Section 60, the minimized $\sum w(y_{obs} - y_{calc})^2$ appears in the
extreme left entry of Row III, as usual. Directly above it in
Row 5 comes the reduction brought about by changing a from 1 to
its final value, and in Row 6 appears the further reduction accomplished
by turning the logarithmic line from the horizontal through
the angle arc tan b'.

Exercise 16. In fitting the equation

$$y = ae^{bx}$$

with x and y both subject to error, we may take F as in Exercise 13,
whereupon

$$F_0 = Y - a_0 e^{b_0 X} \qquad \text{(X and Y observed)}$$

Here we have use for the additional derivative $F_x = -by$, whence

$$L \text{ or } \frac{1}{W} = \frac{b^2y^2}{w_x} + \frac{1}{w_y}$$

If x is free of error, the first term of 1/W drops out and leaves
$W = w_y$, the situation assumed for Exercise 14; if y is free of error,
the second term drops out and leaves $W = w_x/b^2y^2$.

Since we are here taking the case where x and y may both be in
error, we set up Table 1 of Section 60 with headings as follows:

Point No.    $F_x = -by$    $w_x$    $w_y$    $L$ or $1/W$    $\sqrt{L}$ or $1/\sqrt{W}$    $F_a = -y/a$    $F_b = -xy$    $F_0$

From this is formed Table 2 with headings exactly like those of
Exercise 14 but with w replaced by W. Likewise, the normal equations
will be symbolized as in Exercise 14, w replaced by W. In
fact, once Table 2 is set up, from then on it is immaterial to the
computer whether one or both coordinates are subject to error,
a statement that holds good in any problem of curve fitting.





The solution of the normal equations will give A and B. The
minimized sum of weighted squares (S) will appear in the extreme
left entry of Row III, the portions removed by the successive adjustments
of a and b falling in Rows 5 and 6 directly above S.
(Cf. Exercise 3 of Sec. 61 and Exercises 1, 4, 5, 10-15 of this chapter.)
Here, $S = \sum (w_x V_x^2 + w_y V_y^2)$, both x and y residuals
being present.

Exercise 17. To use the logarithmic form of the exponential
(see the preceding exercise) we write

$$\log y = \log a + b'x \qquad (b' = b \log e = 0.434b)$$

or

$$y' = a' + b'x$$

For fitting the exponential $y = ae^{bx}$ when x and y are both subject to
error, one would define f as in Exercise 15; whereupon

$$f_0 = Y' - (a_0' + b_0'X)$$

The derivatives of f are as in Exercise 15. L or 1/W will now
have two terms, both coordinates being subject to error; in fact

$$L \text{ or } \frac{1}{W} = \frac{b'^2}{w_x} + \frac{1}{w_{y'}}$$

$w_{y'}$ being the weight of log y. The normal equations will be
symbolized exactly like those of Exercise 15a, but with W in place
of w'. The extreme left entry in Row III will be the minimized
sum of squares S, with the remarks at the end of Exercise 16
applying here as well.

The analogy with Exercise 4 is perfect throughout, as shown by
the following summary:

Exercise 4:     $y = a + bx$,       $\dfrac{1}{W} = \dfrac{b^2}{w_x} + \dfrac{1}{w_y}$

Exercise 17:    $y' = a' + b'x$,    $\dfrac{1}{W} = \dfrac{b'^2}{w_x} + \dfrac{1}{w_{y'}}$

All the remarks and notes of Exercise 4 apply here if y is replaced
by y' or log y, a by a', b by b', and $w_y$ by $w_{y'}$.





It is possible that in some problems, $w_{y'}$ might be constant from
one point to another (in which case $w_y$ is not constant); then if $w_x$
is also constant, we have a situation toward which Remark 2 at the
end of Exercise 8 (p. 186) is directed.

Exercise 18. (a) Take

$$f = \log y - (\log a + b'x) \qquad \text{(as in Exercise 15)}$$

$$F = y - ae^{bx} \qquad \text{(as in Exercise 13)}$$

and suppose that x, y, a, and b take on small increments denoted by
δx, etc. Prove that

$$\delta F = \frac{y\,\delta f}{\log e};$$

hence at any point, $F_0 = y f_0/\log e$ to within higher powers of $f_0$ or
$F_0$.

(b) Thence prove that the normal equations in Exercise 14 for
fitting $y = ae^{bx}$ will give the same curve, i.e., the same results for
a and b and for $\sum w(y_{obs} - y_{calc})^2$, as the normal equations in
Exercise 15a for the equivalent logarithmic form, except for discrepancies
involving the squares and higher powers of residuals,
the logarithmic form being slightly more accurate.⁷ (Hint:
Note that if A is small, A' = 0.434A/a. Also, so far as a and b
are concerned, the top normal equation in Exercise 14 may be
multiplied through by a.)

The same comparison holds between Exercises 16 and 17. But
with b ≠ 0, and x and y both subject to error, there is not so much
advantage in the logarithmic form.
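A sketch of the argument for part (a), written out as a worked derivation. It assumes common logarithms (so that log e = 0.434...) and evaluates the increments at a point where $y = ae^{bx}$ holds to first order; it is offered as one possible line of proof, not as the author's own.

```latex
\begin{aligned}
f &= \log y - \log a - bx\log e, \\
\delta f &= \log e\left(\frac{\delta y}{y} - \frac{\delta a}{a} - b\,\delta x - x\,\delta b\right), \\
F &= y - ae^{bx}, \\
\delta F &= \delta y - e^{bx}\,\delta a - ae^{bx}\left(b\,\delta x + x\,\delta b\right) \\
         &= y\left(\frac{\delta y}{y} - \frac{\delta a}{a} - b\,\delta x - x\,\delta b\right)
            \qquad (\text{using } ae^{bx} = y) \\
         &= \frac{y\,\delta f}{\log e} = 2.30\,y\,\delta f .
\end{aligned}
```

Squaring the last line shows why the weight of log y must be $(2.30\,y)^2$ times the weight of y: with that factor, $w' f_0^2$ and $w F_0^2$ agree to first order at every point.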

Remark. It may be worth while to pause for a comment on the
factor $(2.30\,y)^2$ which is required for the proper weighting of log y.
Take the one-parameter curve⁸

$$y = 10^{bx}$$

and, to make the problem simple, let it be fitted to just two points
in the xy plane, x being free of error and both y observations of
equal weight (unity for convenience).

x     y     log y
1     10    1
2     95    1.9777

⁷ Mr. K. A. Norton pointed this out in one of the author's classes.

⁸ This illustration was developed in some correspondence with Professor
W. L. Gaines of the University of Illinois, extending between 1932 and 1938;
also in conversations with Mr. G. R. Gause, lately of the Aberdeen Proving
Ground, now with the War Department in Washington.

To arrive at the least squares solution, we may write

$$F = \log y - bx$$

$$F_0 = \log y_{obs} - b_0 x$$

$$F_b = -x$$

$$F_x = -b_0 \qquad \text{(not needed because the weight of x is infinite)}$$

$$F_y = \frac{1}{2.30\,y}$$

$$w_y = 1$$

$$\frac{1}{w'} = \frac{F_y F_y}{w_y} = \frac{1}{(2.30\,y)^2}$$

There is only one normal equation, namely,

$$B = \frac{[w'F_bF_0]}{[w'F_bF_b]} = -\frac{\sum w'x(\log y_{obs} - xb_0)}{\sum w'x^2}$$

wherein $w' = (2.30\,y)^2$.

Note: The same equation for B can be derived by saying
that we seek to minimize

$$S = \sum w'(\log y_{obs} - \log y_{calc})^2, \qquad w' = (2.30\,y)^2$$

Replace $\log y_{calc}$ by bx, or, rather, its equivalent $(b_0 - B)x$,
and get

$$S = \sum w'(\log y_{obs} - [b_0 - B]x)^2$$

B is the (unknown) quantity which, when subtracted from
$b_0$, gives the final b. Now differentiate S with respect to B,
and set this derivative equal to zero. The result is

$$\sum w'x(\log y_{obs} - [b_0 - B]x) = 0$$

Solve for B, and the result will agree precisely with that shown
above.





To continue, we may take $b_0 = 1$, a value easily found by inspection.
We also replace w' by the correct weighting function $y_{obs}^2$
(the constant factor $2.30^2$ cancels), and get

$$B = -\frac{1 \times 10^2(1 - 1) + 2 \times 95^2(1.97772 - 2 \times 1)}{1^2 \times 10^2 + 2^2 \times 95^2} = \frac{402.15}{36{,}200} = 0.01111$$

whence

$$b = b_0 - B = 1 - 0.01111 = 0.98889$$

This is the least squares value of b.

Now if in performing the solution we had inadvertently taken
w' = 1, forgetting that the weight of log y is not the same as the
weight of y, L would have appeared to be constant (unity),
instead of proportional to $1/y^2$, and the result would have been

$$B = -\frac{\sum x(\log y_{obs} - xb_0)}{\sum x^2} = -\frac{(1 - 1) + 2(1.97772 - 2)}{1^2 + 2^2} = \frac{0.04456}{5} = 0.008912$$

and b would have been

$$b = b_0 - B = 1 - 0.00891 = 0.99109$$

The comparison of the sum of squares for the two different
solutions is shown below.

Correct weighting, b = 0.98889

x    y_obs    log y_calc = 0.98889x    y_calc     y_obs − y_calc    (y_obs − y_calc)²
1    10       0.98889                  9.7474      0.253            0.064
2    95       1.97778                  95.012     −0.012            0.000

Sum of squares, S = 0.064

False weighting, b = 0.99109

x    y_obs    log y_calc = 0.99109x    y_calc     y_obs − y_calc    (y_obs − y_calc)²
1    10       0.99109                  9.797       0.203            0.041
2    95       1.98218                  95.966     −0.966            0.933

Sum of squares, S = 0.974






Thus, by weighting log y in proportion to y 0 b 8 2 we obtain a sum 
of squares that is only a fifteenth that obtained by ignoring the 
change in weight. 
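The two-point illustration can be rerun in a few lines. The sketch below is an assumption-laden restatement of the arithmetic above, not the author's worksheet: the function names are hypothetical, full-precision logarithms are used instead of five-place values, and for that reason the printed sums of squares may differ slightly in the last figures from the tabulated ones.

```python
import math

x = [1.0, 2.0]
y = [10.0, 95.0]
b0 = 1.0                                   # approximate value found by inspection

def adjusted_b(weights):
    """b = b0 - B, with B = -sum w'x(log y - x b0) / sum w'x^2."""
    num = sum(w * xi * (math.log10(yi) - b0 * xi) for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return b0 - (-num / den)

def sum_of_squares(b):
    """Sum of (y_obs - y_calc)^2 for the curve y = 10^(bx), y of unit weight."""
    return sum((yi - 10.0 ** (b * xi)) ** 2 for xi, yi in zip(x, y))

b_correct = adjusted_b([yi * yi for yi in y])   # log y weighted in proportion to y^2
b_false = adjusted_b([1.0, 1.0])                # change of weight ignored
print(b_correct, sum_of_squares(b_correct))
print(b_false, sum_of_squares(b_false))
```

The constant factor $2.30^2$ is left out of the weights because it cancels between numerator and denominator of B.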

The only circumstance under which the factor $y^2$ may be
ignored is where the fitted logarithmic line is nearly horizontal,
for then the weighting factor $y^2$ is nearly constant from one
point to another and can be omitted without serious error in
the formation of the normal equations, the parameters a, b, c,
etc., being left practically unaltered.

Even so, the last entry in the "1" column (Row IV, p. 158)
is not S, but requires multiplication by an average value of
$(2.30y)^2$, which might be denoted by $\overline{(2.30y)^2}$. Moreover,
each element of the reciprocal matrix requires division by
$\overline{(2.30y)^2}$ before it is to be interpreted as a variance coefficient
(cf. the remark on p. 195). However, it is interesting to note
that, owing to compensation, the uncorrected elements when
used in Eq. 24 (p. 168), along with the external estimate of $\sigma$
made from the uncorrected sum of squares, will give the correct
value for the estimated standard error of a function.

The factor $y^2$ takes care of the change in scale that accompanies
the transfer to logarithms. The student may find it
helpful to refer back to Fig. 9 on page 45. The y values may
all have the same weight, but their logarithms do not. No
matter what function is being fitted, the two terms $F_xF_x/w_x$
and $F_yF_y/w_y$ in L or 1/W (cf. Eq. 8, p. 134) can be relied upon to
perform the same service as $(2.30y)^2$ does for the logarithmic
scale.

This example is an illustration of the fact that if the pro¬ 
cedure of Section 60 is followed, it makes no difference how 
a formula is written. One form will give the same curve as 
another, except for disturbances arising from the neglect of 
second and higher powers of the residuals, but these are not 
usually of much consequence if the data are worth fitting. 

Of course, in some lines of work, the weight of y is approximately
inversely proportional to $y^2$, whence the weight of log y
is practically constant, independent of y. When this is so, the
weighting factor $y^2$ is to be omitted.

Exercise 19. (Yntema's refinement.) 9 In fitting the curve

$$y = a e^{bx}$$

with x free of error, we seek to minimize

$$S = \sum w\,(y - y_c)^2$$

where w is the weight of the observed y coordinate at a particular
point, and $y_c$ is the calculated ordinate, $a e^{bx}$. The normal equations
will be obtained by equating to zero the derivatives of S
with respect to a and b, by which process we find that

$$\sum w\,(y - y_c)\,\frac{\partial y_c}{\partial a} = 0, \qquad \sum w\,(y - y_c)\,\frac{\partial y_c}{\partial b} = 0$$

9 This device has been taught by Professor Theodore Yntema at the University
of Chicago for years. It was first called to my attention by Dr. John H.
Smith of the University of Chicago (more recently of the Bureau of Labor
Statistics in Washington).
Now we may use the logarithmic form by rewriting these
equations as

$$\sum \theta(y)\,(y' - y_c')\,\frac{\partial y_c'}{\partial a'} = 0, \qquad \sum \theta(y)\,(y' - y_c')\,\frac{\partial y_c'}{\partial b'} = 0$$

wherein $y' = \log y$, $y_c' = \log y_c$, $a' = \log a$, $b' = b \log e$, and
$\theta(y)$ is such a function of y that the two forms of the equations are
the same. Evidently it must be that

$$\theta(y) = w\,\frac{\partial y_c/\partial a}{\partial y_c'/\partial a'}\cdot\frac{y - y_c}{y' - y_c'} = w\,\frac{dy_c}{dy_c'}\cdot\frac{y - y_c}{y' - y_c'} = 2.30^2\,w\,y\,y_c \qquad (y \text{ here denotes } y_{obs}.)$$

The last equality is not exact, but is very close, as Professor Yntema
discovered, and as the student may wish to demonstrate for himself.

The normal equations are exactly like those in Exercise 15, $\theta(y)$
now replacing $w'$. It will be observed that the Yntema refinement
has merely replaced $y^2$ in $w'$ by $y_c y$ in $\theta(y)$. There will be
scarcely any distinction if the residuals are all very small, in which
event $y_c$ and y (the calculated and observed y values) will be very
nearly equal, and $y_c y$ will be very nearly the same as $y^2$. When
the curve does not fit well, it may be important to take account of
the Yntema refinement.
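The closeness of the Yntema approximation can be examined numerically. The sketch below assumes common logarithms and takes the "exact" weighting function to be w (dy_c/dy_c') (y - y_c)/(y' - y_c'), with dy_c/dy_c' = 2.30 y_c; the function names are illustrative only and are not from the text.

```python
import math

LN10 = math.log(10.0)  # 2.302585..., the "2.30" of the text


def theta_exact(w, y, yc):
    """w * (dy_c/dy_c') * (y - y_c)/(y' - y_c'), primes denoting log10."""
    if y == yc:
        # limiting value as the residual vanishes
        return w * (LN10 * yc) ** 2
    slope = (y - yc) / (math.log10(y) - math.log10(yc))
    return w * (LN10 * yc) * slope


def theta_yntema(w, y, yc):
    """The refined weighting function 2.30^2 * w * y * y_c."""
    return LN10 ** 2 * w * y * yc


def weight_unrefined(w, y):
    """The weight of log y without the refinement, 2.30^2 * w * y^2."""
    return LN10 ** 2 * w * y * y


# First point of the preceding example: y_obs = 10, y_calc = 9.7474
exact = theta_exact(1.0, 10.0, 9.7474)
refined = theta_yntema(1.0, 10.0, 9.7474)
unrefined = weight_unrefined(1.0, 10.0)
print(exact, refined, unrefined)
```

For this point the refined value lies within about one and a half per cent of the exact one, and is nearer to it than the unrefined weight.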

68. The exponential with a linear component 

Exercise 20. Given the equation

$$y = a e^{bx} + cx + d$$

to be fitted to n points. Write

$$F \equiv y - a e^{bx} - cx - d$$

The derivatives are

$$F_a = -e^{bx}, \quad F_b = -x a e^{bx}, \quad F_c = -x, \quad F_d = -1$$

$$F_x = -a b e^{bx} - c, \quad F_y = 1$$

$$\frac{1}{W} = \frac{(a b e^{bx} + c)^2}{w_x} + \frac{1}{w_y}$$

(The first term of 1/W is missing if x is free of error, the second if y
is free of error.)

The formation of Tables 1 and 2 of Section 60, and the formation
of the normal equations and their solution, proceed in much the
same fashion as heretofore. The only novelty is that here there
are four parameters, and hence four unknown parameter-residuals,
A, B, C, and D. The numerical values of $F_0$ and the derivatives
$F_a$, $F_b$, $F_x$, etc., for use in Tables 1 and 2, are calculated with the
approximate values $a_0$, $b_0$, $c_0$, and $d_0$, arrived at somehow (see the
reduced type at the end of Sec. 55).

The extreme left entry in Row V of the solution of the normal
equations will give the minimized $\sum (w_x V_x^2 + w_y V_y^2)$. The
entries just above it in Rows 12, 13, 14, and 15 will show the reductions
in the weighted sum of squares arising from the successive
adjustments of a, b, c, and d.

If only the y coordinates are subject to error, the extreme
left entry in Row V will give $\sum w \cdot \text{res}^2$, the deviations being
measured in the vertical (i.e., parallel to the y axis). Moreover,
if all coordinates have the same weight (unity), then W = 1
throughout, and Tables 1 and 2 of Section 60 coalesce.





With a formula of this kind, there is no possibility of making 
it linear by such a device as taking logarithms, for which reason, 
this problem and others like it have been called insoluble. For¬ 
tunately, the solution is entirely straightforward. 
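The routine of Exercise 20 can be sketched in a few lines for the simplest case, in which only y is subject to error and all weights are unity (W = 1). The data below are artificial, generated from assumed "true" parameters, and the helper names `solve` and `adjust` are illustrative, not part of the text. Each cycle forms the normal equations from F_0 and the derivatives evaluated at the current approximations, solves for the parameter-residuals A, B, C, D, and subtracts them.

```python
import math


def solve(M, r):
    """Solve M t = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    A = [row[:] + [r[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n + 1):
                A[i][j] -= f * A[k][j]
    t = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = A[i][n] - sum(A[i][j] * t[j] for j in range(i + 1, n))
        t[i] = s / A[i][i]
    return t


def adjust(xs, ys, params):
    """One cycle of adjustment for y = a*exp(b*x) + c*x + d (W = 1)."""
    a0, b0, c0, d0 = params
    rows = []
    for x, y in zip(xs, ys):
        e = math.exp(b0 * x)
        F0 = y - a0 * e - c0 * x - d0
        # F_a, F_b, F_c, F_d, F_0 at this point
        rows.append([-e, -x * a0 * e, -x, -1.0, F0])
    N = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    rhs = [sum(r[i] * r[4] for r in rows) for i in range(4)]
    A, B, C, D = solve(N, rhs)  # parameter-residuals
    return (a0 - A, b0 - B, c0 - C, d0 - D)


# Artificial, error-free data from assumed parameters (illustration only)
true = (2.0, 0.3, 0.5, 1.0)
xs = [float(i) for i in range(8)]
ys = [true[0] * math.exp(true[1] * x) + true[2] * x + true[3] for x in xs]

params = (1.9, 0.31, 0.45, 1.1)  # approximate values, "arrived at somehow"
for _ in range(20):
    params = adjust(xs, ys, params)
print(params)
```

With error-free data and fair approximations, repeated cycles return the assumed parameters; with real data the cycles would stop when the residuals A, B, C, D become negligible.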


69. The generalized hyperbola and its logarithmic form 

Exercise 21. Given the equation

$$y = a x^b$$

to be fitted to n points. Here we write

$$F \equiv y - a x^b$$

whence

$$F_0 = y_{obs} - a\,x_{obs}^{\,b} \qquad (a \text{ and } b \text{ being replaced by } a_0 \text{ and } b_0)$$

The derivatives of F are

$$F_x = -\frac{by}{x}, \quad F_y = 1, \quad F_a = -\frac{y}{a}, \quad F_b = -y \ln x$$

$$L \text{ or } \frac{1}{W} = \frac{b^2 y^2}{x^2 w_x} + \frac{1}{w_y}$$

The headings for Table 1 of Section 60 in the general case
would be

  h | F_x = -by/x | w_x | w_y | L or 1/W | 1/√W || F_a = -y/a | F_b = -y ln x | F_0

It is easy to make the necessary modifications for special cases.
Thus, if x is free of error, then $W = w_y$ and the $F_x$ and $w_x$ columns
are superfluous; if further, all y coordinates have equal weight
(unity), then W = 1 for all points, and Tables 1 and 2 will coalesce.
On the other hand, if y is free of error, then $1/W = b^2 y^2 / x^2 w_x$ and
the $w_y$ column is omitted. From Table 1 is formed Table 2 with
these headings:

  h | √W F_a | √W F_b | √W F_0 | Sum


The usual sums of squares and cross-multiplications from
Table 2 give the normal equations

  Row    A             B             = 1            C_1    C_2    Sum
  I      [WF_aF_a]     [WF_aF_b]     -[WF_aF_0]     1      0
  2                    [WF_bF_b]     -[WF_bF_0]     0      1
  3                                  [WF_0F_0]      0      0



Exercise 22. The equation $y = ax^b$ of the preceding exercise
may be turned into the logarithmic form

$$\log y = \log a + b \log x$$

or

$$y' = a' + b x' \qquad \text{(as in Exercise 15.)}$$

Let

$$f = y' - (a' + b x')$$

$f_0$ as usual.

The derivatives of f are

$$f_x = -\frac{0.434\,b}{x}, \quad f_y = \frac{0.434}{y}, \quad f_{a'} = -1, \quad f_b = -x'$$

Then

$$L \text{ or } \frac{1}{W} = \frac{b^2}{w_{x'}} + \frac{1}{w_{y'}}$$

wherein

$$\frac{1}{w_{x'}} = \frac{0.434^2}{x^2 w_x} = \frac{1}{\text{wt. of } x' \text{ or } \log x} \qquad \text{(See Exercise 8e on p. 45.)}$$

with $1/w_{y'}$ similarly defined. The headings for Table 1 of Section 60
will be these:

  h, or Point No. | w_x | w_x' | w_y | w_y' | L or 1/W | 1/√W | f_x | f_b | f_0

(See the remarks under Table 1 of the preceding exercise. $f_{a'}$ is not listed,
being constant.)





The headings of Table 2 will be the usual ones, as in Table 2 of
the preceding exercise with f in place of F.

The normal equations will be symbolized precisely like those of
Set 1 in Exercise 4. In fact all the remarks and notes of Exercise 4
can be translated directly to the present problem. The
reason is obvious: we have here a line in the variables $x'$ and $y'$
with weights $w_{x'}$ and $w_{y'}$. The two terms of L or 1/W seen above
take care of the change in the form of the function from exponential
to logarithmic. In fact, we could say that

$$\frac{1}{W} = \frac{f_{x'}f_{x'}}{w_{x'}} + \frac{f_{y'}f_{y'}}{w_{y'}} = \frac{f_x f_x}{w_x} + \frac{f_y f_y}{w_y}$$

as in Exercise 11b, page 46.
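The equality of the two expressions for 1/W can be checked numerically. In the sketch below the function names and the sample values are illustrative assumptions; common logarithms are used, so that 0.434 stands for log e, and the derivatives with respect to the logarithmic variables are taken as f_x' = -b and f_y' = 1.

```python
import math

M = math.log10(math.e)  # 0.434294..., the "0.434" of the text


def inv_W_from_xy(x, y, b, wx, wy):
    """1/W = f_x f_x / w_x + f_y f_y / w_y, with f = log y - (log a + b log x)."""
    fx = -M * b / x
    fy = M / y
    return fx * fx / wx + fy * fy / wy


def inv_W_from_logs(x, y, b, wx, wy):
    """1/W = b^2 / w_x' + 1 / w_y', the weights being those of log x and log y."""
    w_xp = x * x * wx / (M * M)  # weight of x' = log x
    w_yp = y * y * wy / (M * M)  # weight of y' = log y
    return b * b / w_xp + 1.0 / w_yp


# Hypothetical point and weights, chosen only for illustration
lhs = inv_W_from_xy(3.0, 7.5, 1.7, 4.0, 9.0)
rhs = inv_W_from_logs(3.0, 7.5, 1.7, 4.0, 9.0)
print(lhs, rhs)
```

The two results agree to rounding error, whatever point and weights are supplied.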

Exercise 23. Prove that the normal equations of Exercise 21
for fitting $y = ax^b$ will give the same curve, i.e., the same results
for a and b and hence for S, as the normal equations of Exercise 22
for the equivalent logarithmic form, log y = log a + b log x, except
for discrepancies involving the squares and higher powers
of residuals, the logarithmic form being slightly more accurate,
especially if x is free of error. (Refer back to Exercises 18 and 22.)


70. The hyperbola with a linear component 

Exercise 24. Given the equation

$$y = a x^b + c + dx$$

to be fitted to n points. Write

$$F \equiv y - a x^b - c - dx$$

$F_0$ at any point is found, as usual, by giving x and y their observed
values at that point, and a, b, c, d their approximate values $a_0$, $b_0$,
$c_0$, $d_0$ (found somehow; Sec. 55). The derivatives of F are

$$F_x = -a b x^{b-1} - d, \quad F_y = 1, \quad F_a = -x^b, \quad F_b = -a x^b \ln x, \quad F_c = -1, \quad F_d = -x$$

whence

$$L \text{ or } \frac{1}{W} = \frac{(a b x^{b-1} + d)^2}{w_x} + \frac{1}{w_y} \qquad (W = w_y \text{ if } x \text{ is free of error})$$


Tables 1 and 2 of Section 60 are made up, and the normal equations
formed and solved, by the usual routine. Row V in the "1"
column will give the minimized value of S or $\sum (w_x V_x^2 + w_y V_y^2)$,
the successive reductions in the weighted sum of squares appearing
in Rows 12, 13, 14, and 15, as usual (see Exercises 1, 4, 5, 10-16).

The reader should refer back to the reduced type appended to 
Exercise 20, which applies here as well. 

Exercise 25. Given the equation

$$u = ax + by^c + dz$$

u, x, y, and z possibly all being observed. (This equation is used
by Professor W. L. Gaines at the University of Illinois in his work
on nutrition and lactation.) Take

$$F = u - (ax + by^c + dz)$$

$F_0$ as usual.

The derivatives of F are

$$F_u = 1, \quad F_x = -a, \quad F_y = -c b y^{c-1}, \quad F_z = -d$$

$$F_a = -x, \quad F_b = -y^c, \quad F_c = -b y^c \ln y, \quad F_d = -z$$

$$\frac{1}{W} = \frac{1}{w_u} + \frac{a^2}{w_x} + \frac{(c b y^{c-1})^2}{w_y} + \frac{d^2}{w_z}$$

Here we have a problem in four dimensions; 1/W contains four
terms. The first term is absent if u is free of error, the second if x
is free of error, etc.

The headings for Table 1 in Section 60 would be these:

  h, or Point No. | w_u | w_x | by^c | F_y | w_y | w_z | 1/W | 1/√W | F_a = -x | F_b = -y^c | F_c = -by^c ln y | F_d = -z | F_0

Some of these headings will be omitted if any of the u, x, y, or z
values are free of error throughout.




Table 2 will be formed by divisions in Table 1, and the headings
would be as shown below:

  h, or Point No. | √W F_a | √W F_b | √W F_c | √W F_d | √W F_0 | Sum

The normal equations will be formed and solved in the usual
routine manner (p. 158). Row V will contain the minimized value
of S, being in this case $\sum (w_u V_u^2 + w_x V_x^2 + w_y V_y^2 + w_z V_z^2)$
in the "1" column, the successive reductions in the weighted sum
of squares appearing in Rows 12, 13, 14, and 15, as usual (see
Exercises 1, 4, 5, 10-16, 24).

If only the u coordinates are subject to error, $S = \sum w_u V_u^2$.
Moreover, if all the u coordinates have the same weight (unity),
then $W = w_u = 1$ throughout, and Tables 1 and 2 coalesce.

The second paragraph in reduced type appended to Exercise 20
applies here (p. 204).
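The entries of Tables 1 and 2 at a single point of Exercise 25 may be sketched as follows. The function name `table_rows`, the use of `None` to mark a coordinate that is free of error, and the sample numbers are all assumptions made for illustration.

```python
import math


def table_rows(point, params, weights):
    """Table 1 and Table 2 entries at one point for u = a*x + b*y**c + d*z.

    weights maps "u", "x", "y", "z" to a weight, or to None when that
    coordinate is free of error (its term then drops out of 1/W).
    """
    u, x, y, z = point
    a, b, c, d = params
    F0 = u - (a * x + b * y ** c + d * z)
    # derivatives with respect to the observed coordinates
    Fq = {"u": 1.0, "x": -a, "y": -c * b * y ** (c - 1), "z": -d}
    # L = 1/W: one term for each coordinate subject to error
    L = sum(Fq[k] ** 2 / w for k, w in weights.items() if w is not None)
    # derivatives with respect to the parameters a, b, c, d
    Fp = [-x, -y ** c, -b * y ** c * math.log(y), -z]
    root_W = 1.0 / math.sqrt(L)
    table2 = [root_W * f for f in Fp] + [root_W * F0]
    return L, table2


# Hypothetical point and approximate parameters, all weights unity
L_all, row_all = table_rows((16.0, 1.0, 2.0, 1.0), (2.0, 3.0, 2.0, 1.0),
                            {"u": 1.0, "x": 1.0, "y": 1.0, "z": 1.0})
# The same point with only u subject to error
L_u, row_u = table_rows((16.0, 1.0, 2.0, 1.0), (2.0, 3.0, 2.0, 1.0),
                        {"u": 1.0, "x": None, "y": None, "z": None})
print(L_all, L_u)
```

In the second call 1/W reduces to unity, and the Table 2 entries are the same as the Table 1 derivatives, which is the coalescence mentioned above.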


71. Miscellaneous 

Exercise 26. (a) Given the equation

$$u = ax + by + cz$$

to be fitted to n observed points. Take

$$F = u - (ax + by + cz)$$

and show that

$$\frac{1}{W} = \frac{1}{w_u} + \frac{a^2}{w_x} + \frac{b^2}{w_y} + \frac{c^2}{w_z}$$

Then the normal equations will be symbolized in the form shown.

  Row    A        B        C        = 1          C_1    C_2    C_3    Sum
  I      [Wxx]    [Wxy]    [Wxz]    -[WxF_0]     1      0      0
  2               [Wyy]    [Wyz]    -[WyF_0]     0      1      0
  3                        [Wzz]    -[WzF_0]     0      0      1
  4                                 [WF_0F_0]    0      0      0








(b) If it is desired to solve for a, b, c directly, the unknowns in
the normal equations would be a, b, and c, and the "1" column
would be

  [Wxu]
  [Wyu]
  [Wzu]
  [Wuu]

Extra decimals will be required for accuracy, as mentioned on
page 175.

(c) Prove that the minimized sum of the weighted squares is

$$S = [WF_0F_0] - [WxF_0]A - [WyF_0]B - [WzF_0]C$$

If the normal equations are set up to give a, b, and c directly, as in
part (b), then

$$S = [Wuu] - [Wxu]a - [Wyu]b - [Wzu]c$$

(See Exercise 3a in Sec. 61.)

Remark. If u alone is subject to error, and of uniform precision
(unit weight) throughout, the only change in the normal
equations would be that W would not appear, being unity
throughout. The minimized sum of squares would be

$$S = [uu] - [xu]a - [yu]b - [zu]c$$

This equation is used a good deal in some kinds of statistical
work; see, e.g., p. 160 of the 6th edition of Fisher's Statistical
Methods for Research Workers, on which the above equation
appears as

$$S(y - Y)^2 = S(y^2) - b_1 S(x_1 y) - b_2 S(x_2 y) - b_3 S(x_3 y)$$

this being the sum of squares after fitting

$$Y = b_1 x_1 + b_2 x_2 + b_3 x_3$$

Example 3 in Chapter XI is an illustration.
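The formula in the Remark can be verified on a small set of figures. The data below are invented solely for illustration, and the helper names `solve` and `br` (for the bracket sums) are not from the text; u alone is taken as subject to error, with unit weight.

```python
def solve(M, r):
    """Solve M t = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    A = [row[:] + [r[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n + 1):
                A[i][j] -= f * A[k][j]
    t = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = A[i][n] - sum(A[i][j] * t[j] for j in range(i + 1, n))
        t[i] = s / A[i][i]
    return t


def br(p, q):
    """Bracket sum [pq]: the sum of the products p*q over all points."""
    return sum(pi * qi for pi, qi in zip(p, q))


# Invented observations (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
z = [1.0, 0.0, 1.0, 0.0, 2.0]
u = [3.1, 2.9, 6.2, 5.8, 10.1]

cols = [x, y, z]
N = [[br(p, q) for q in cols] for p in cols]  # [xx], [xy], ... of part (b)
rhs = [br(p, u) for p in cols]                # the "1" column: [xu], [yu], [zu]
a, b, c = solve(N, rhs)

# Sum of squares computed directly from the residuals
S_direct = sum((ui - (a * xi + b * yi + c * zi)) ** 2
               for xi, yi, zi, ui in zip(x, y, z, u))
# Sum of squares from the formula of the Remark
S_formula = br(u, u) - rhs[0] * a - rhs[1] * b - rhs[2] * c
print(S_direct, S_formula)
```

The formula saves the labor of computing the individual residuals, but because it is a small difference between large numbers it shows why extra decimals are required.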

Exercise 27. In pharmacology and toxicology, experiments are
made on a certain number n of organisms or animals to test the
lethal action of a drug or dosage of X-rays, for various concentrations,
or various times of exposure. The proportion killed is
usually designated by the letter p, and the proportion surviving
by the letter q. Under the assumption that the susceptibility of
an individual to a poison is a normally distributed variate, the
relation of p and q to the deviation y from the average susceptibility
may be expressed in terms of the normal integral by the
equation

$$q = \frac{1}{\sqrt{2\pi}} \int_y^{\infty} e^{-t^2/2}\,dt = 1 - p$$

By using a table of the normal probability integral it is possible
to express q in terms of y. To avoid the use of the negative normal
deviates that arise when the observed survival q is more than half
of the animals tested, Bliss 10 has introduced the term probit, and
has provided tables for conversion. Probits are simply normal
deviates to which the constant number 5 has been added. If Y
denotes a probit, then Y = y + 5. The scale in probits runs
practically from 1 to 9, the 5 in the middle corresponding to the
center of the normal curve, where y = 0.

It has been found that when the dosage is expressed in logarithms,
and the observed proportion q surviving is transformed
into probits, then the relation between the log-dosage and the
probits surviving is approximated by a straight line. The fitting
of this line, with proper weighting of the points, constitutes an
important application of least squares. To fit the line by least
squares, the weights of the probits must be obtained. Now q is
a proportion, and the assumption is made that the n animals or
organisms are drawn randomly from some universe, wherefore the
theoretical variance of q is pq/n. Then by Eq. 8 on page 40, the
variance of the probit Y can be written

$$\sigma_Y^2 = \left(\frac{dY}{dq}\right)^2 \sigma_q^2 = \left(\frac{dY}{dq}\right)^2 \frac{pq}{n}$$

Show by differentiating the equation relating q to y that

$$\sigma_Y^2 = \frac{pq}{n z^2}$$

where z is the ordinate of the normal curve at the probit Y. Then

$$w_Y = \frac{n z^2}{pq}$$

is the weight to be applied to the probit Y in fitting the dosage-mortality
curve. Note that the probit Y is a quantity called a
percentile of a distribution. If the standard deviation of the
sampled universe is $\sigma$, then the

$$\text{Variance of the percentile} = \frac{pq\,\sigma^2}{n z^2}$$

Since probits are by definition expressed in terms of the standard
deviation of the assumed normal curve, the quantity $\sigma^2$ in this
problem is equal to unity.

10 C. I. Bliss, "The calculation of the dosage-mortality curve," The Annals
of Applied Biology, vol. 22, 1935: pp. 134-167.
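The probit and its weight are easily computed where a table is not at hand. The following sketch uses the normal distribution supplied with Python's standard library; the function name `probit_and_weight` is illustrative, and the deviate is here taken from the proportion killed, p, on the assumption that q is the area of the normal curve above y. The weight is the same whichever of p or q is used, since z and pq are symmetrical.

```python
from statistics import NormalDist

_std = NormalDist()  # normal curve with mean 0 and standard deviation 1


def probit_and_weight(killed, n):
    """Return (probit Y, weight w_Y = n z^2 / (p q)) for one dosage group.

    killed must lie strictly between 0 and n; at 0 or n the normal
    deviate is infinite and the inverse function raises an error.
    """
    p = killed / n
    q = 1.0 - p
    y = _std.inv_cdf(p)  # normal deviate corresponding to p
    z = _std.pdf(y)      # ordinate of the normal curve at y
    return y + 5.0, n * z * z / (p * q)


# Hypothetical group: 5 killed out of 10
Y_mid, w_mid = probit_and_weight(5, 10)
# Hypothetical group: 9 killed out of 10
Y_hi, w_hi = probit_and_weight(9, 10)
print(Y_mid, w_mid)
print(Y_hi, w_hi)
```

A group in which half are killed falls at probit 5 and carries the greatest weight for a given n; groups near 0 or 100 per cent kill carry much less.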



CHAPTER XI 

FOUR EXAMPLES IN CURVE FITTING 

Example 1. Fitting an Isotherm 

72. Formation and solution of the normal equations. For this
example, data taken by the Michels 1 et al. on carbon dioxide will
be used. The equation to be fitted is

$$y = a + bx + cx^2 + dx^4 \qquad (y \text{ denotes } pv, \text{ pressure times volume; } x \text{ denotes density})$$

The parameters are not independent but are subject to the condition
that

$$y = 1 \text{ when } x = 1$$

This condition arises because of the definition of the unit of volume.
Because of this condition,

$$a = 1 - b - c - d$$

and

$$y = 1 + (x - 1)b + (x^2 - 1)c + (x^4 - 1)d$$

Weights: All y coordinates have equal weight; x is free of error.
Let

$$F = y - \{1 + (x - 1)b + (x^2 - 1)c + (x^4 - 1)d\}$$

Derivatives:

$$F_b = -x + 1, \quad F_c = -x^2 + 1, \quad F_d = -x^4 + 1$$

$F_y = 1$; $F_x$ is not needed since x is free of error.

W = 1 at every point.

1 A. Michels, C. Michels, and H. Wouters, Proc. Royal Soc. (London),
vol. 153A, 1936: pp. 201-224.


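The following approximate values are known from previous
experience:

$$b_0 = -0.006837046$$

$$c_0 = 0.000011392$$

$$d_0 = 0.0_9 1514$$

(The subscript gives the number of zeros following the decimal point, so that $d_0 = 1.514 \times 10^{-10}$.) Then

$$F_0 = y_{obs} - \{1 - 0.00683705\,(x - 1) + 0.000011392\,(x^2 - 1) + 0.0_9 15\,(x^4 - 1)\}$$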

TABLES 1 AND 2
(Formed from the original data)

  Point No.   -F_b          -F_c           -F_d           -F_0             Sum
  1           1.77 × 10     3.51 × 10^2    1.24 × 10^5    0.55 × 10^-4      7.07
  2           2.25          5.49           3.03           0.79             11.56
  3           2.72          7.93           6.30           2.07             19.02
  4           3.18         10.74          11.55           3.09             28.56
  5           3.66         14.15          20.04           6.22             44.07
  6           4.14         17.99          32.39          10.05             64.57
  7           4.61         22.20          49.31          14.84             90.96
  Sum        22.33         82.01         123.86          37.61            265.81 ✓

(The power of 10 shown in the first row of each column applies to the whole column.)


In this example, W = 1 throughout, with the result that Tables 1
and 2 mentioned in Section 60 are identical. The minus signs in
the headings avoid minus signs in the table. The powers of 10
bring uniformity in the denominations of the columns.

The original data were listed to more decimals than are in¬ 
dicated by the above table, and the normal equations shown 
here, it so happens, were formed from the original data, retaining 
all decimals, then rounding them off to the number shown. 
Exact agreement can not be expected, therefore, with the accu¬ 
mulated squares and cross-products that one would form in the 
usual manner from the table above. The effect on the param¬ 
eters, arising from the use of the extra decimals, is negligible, 
and the conclusions are the same either way. 

The Sum column provides a check, which should never be
omitted; it is formed regardless of the powers of 10; in fact no
attention is paid to the powers of 10 until the end, when the
solution is decoded. After the normal equations are formed,
the sums in Rows I, 2, 3 are each raised by 100 to take account
of the entries in the C columns.

EQUATIONS (THE SOLUTION FOLLOWS THE FORM OUTLINED ON PAGE 158)

[The body of the solution is illegible in this copy. Only the final rows can be read, giving the direct solution and, to its right, the reciprocal matrix:]

                                 Reciprocal matrix
  10B     =  0.036136       236.75    -98.03     22.98
  10^2 C  = -0.015361       -98.03     41.65    -10.08
  10^5 D  =  0.307132        22.98    -10.08      2.55      Sum 16.76 ✓

Why is it better to start off with 100 rather than 1 in the 
C columns, for the calculation of the reciprocal matrix? Per¬ 
haps 1000 would have been better than 100. 

Note the symmetry in the reciprocal matrix, which is found 
between the vertical lines in Rows 11, 12, and 13. 

From the normal equations one may make up the following 
tabulation of results. 

$B = +0.036 \times 10^{-5}$,   $b = -0.00683705 - 0.0_6 36 = -0.00683741$

$C = -0.015 \times 10^{-6}$,   $c = 0.0_4 11392 + 0.0_7 15 = 0.0_4 11407$

$D = +0.307 \times 10^{-9}$,   $d = 0.0_9 151 - 0.0_9 307 = -0.0_9 156$

$\therefore\ a = 1 - b - c - d = 1.00682600$

Est'd S.E.² of b = $236.75 \times 10^{-2-2}\,\sigma^2(ext) = 19.3 \times 10^{-12}$

Est'd S.E.² of c = $41.65 \times 10^{-2-4}\,\sigma^2(ext) = 3.40 \times 10^{-14}$

Est'd S.E.² of d = $2.55 \times 10^{-2-10}\,\sigma^2(ext) = 0.209 \times 10^{-20}$

Est'd S.E.² of a = $\sigma^2(ext)\,(236.75 \times 10^{-4} + 41.65 \times 10^{-6} + 2.55 \times 10^{-12} - 2 \times 98.03 \times 10^{-5} + 2 \times 22.98 \times 10^{-8} - 2 \times 10.08 \times 10^{-9}) = 17.7 \times 10^{-12}$

See Eqs. 21 and 22, p. 167; remember that a is a function of b, c,
and d.

See also Exercise 1 ahead.

Final results for the parameters (standard errors estimated from 4 degrees of freedom):

  a =  1.0068260 ± 0.0000042
  b = -0.0068374 ± 0.0000044
  c =  $0.0_4 1141 \pm 0.0_6 18$
  d = $-0.0_9 156 \pm 0.0_9 046$
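The standard errors just listed can be recomputed from the reciprocal matrix. The sketch below takes the elements as corrected for powers of 10 and the rounded value $\sigma^2(ext) = 0.082 \times 10^{-8}$ quoted in this section; the function name is illustrative. Since a = 1 - b - c - d, its variance is that of a linear function of b, c, and d with coefficients -1, -1, -1.

```python
import math

# Reciprocal matrix for (b, c, d), corrected for powers of 10
C = [
    [236.75e-4, -98.03e-5, 22.98e-8],
    [-98.03e-5, 41.65e-6, -10.08e-9],
    [22.98e-8, -10.08e-9, 2.55e-12],
]
sigma2_ext = 0.082e-8  # external estimate of sigma^2, 4 degrees of freedom


def variance_of_linear_function(g, C, sigma2):
    """Variance of g[0]*b + g[1]*c + g[2]*d, given the reciprocal matrix C."""
    return sigma2 * sum(g[i] * C[i][j] * g[j]
                        for i in range(3) for j in range(3))


se_b = math.sqrt(variance_of_linear_function((1, 0, 0), C, sigma2_ext))
se_c = math.sqrt(variance_of_linear_function((0, 1, 0), C, sigma2_ext))
se_d = math.sqrt(variance_of_linear_function((0, 0, 1), C, sigma2_ext))
# a = 1 - b - c - d, so the coefficients are all -1
se_a = math.sqrt(variance_of_linear_function((-1, -1, -1), C, sigma2_ext))
print(se_a, se_b, se_c, se_d)
```

The cross-product terms are what bring the standard error of a slightly below that of b.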

These standard errors appear to be small compared with the
parameters. However, it must be noted that they are calculated
from only 4 degrees of freedom. There is really not much that one
can say in the way of the prediction of future data, purely on the
basis of standard errors that have been calculated from a single
experiment, and in particular if this experiment yields only 4
degrees of freedom, as is true here. A consistent pattern of small
standard errors, in experiment after experiment, would begin to
assume scientific significance, and such is in fact the actual situation
with compressibility data, though the other experiments, and
the calculations therefor, can not be shown here.

An important consideration was voiced at the outset in Chapter 
VIII, wherein it was stated that the real test of a calculated curve 
comes when it is used as a basis for action. The form of equation 
used here, and the method of fitting, have been tested severely in 
this way. For instance, by means of this equation, the Michels 
have calculated various physical properties of carbon dioxide, and 
they and others have carried out similar calculations for other 
gases, and always the results of these calculations have tied up 
closely with whatever direct experimental work exists on the 
index of refraction, Joule-Thomson coefficient, heat capacities, 
entropy, and other properties, most of which are difficult to meas¬ 
ure directly. Manufacturing processes designed on the basis of 
these calculated physical properties have turned out to be correct, 
thus bearing out the usefulness of the parameters so calculated. 

This statement does not contain any argument that these par¬ 
ticular parameters are better than any other set that could be 
obtained from the given observations. It would take a long run 
of experience in the use of various alternative procedures for fitting, 
in order to decide just what method is better than another. Such 
comparisons probably do not exist. 

It is interesting to see what would be the sum of squares if the
term $dx^4$ had been dropped. From Row III we find [cc.2] =
$39.16 \times 10^{10}$; this multiplied by $d^2$ or $(-0.0_9 156)^2$ gives $0.9 \times 10^{-8}$,
which added to $0.33 \times 10^{-8}$ gives $1.23 \times 10^{-8}$ for the
sum of the $(y_{obs} - y_{calc})^2$ that would be obtained from fitting the
curve $y = a + bx + cx^2$. We then find

$$\sigma^2(ext) = \frac{1.23 \times 10^{-8}}{5} = 0.246 \times 10^{-8} \qquad (dx^4 \text{ omitted})$$

We already had

$$\sigma^2(ext) = 0.082 \times 10^{-8} \qquad (dx^4 \text{ included})$$

Since the sum of squares, and hence $\sigma(ext)$, is so much lower
when the term $dx^4$ is included, it appears from the meagre
evidence afforded by the four degrees of freedom of this one
experiment, that one is warranted in carrying this term.

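One may also use Eq. 17 on page 165 for computing the effect
of dropping d. One thus finds for the increase in the sum of squares

$$\sigma^2(ext)\left(\frac{d}{\text{S.E. of } d}\right)^2 = 0.082 \times 10^{-8} \times \left(\frac{0.156}{0.046}\right)^2 = 0.94 \times 10^{-8}$$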


Exercise 1. Show that when corrected for powers of 10 the
reciprocal matrix is

$$A^{-1} = \begin{pmatrix} 236.75 \times 10^{-4} & -98.03 \times 10^{-5} & 22.98 \times 10^{-8} \\ -98.03 \times 10^{-5} & 41.65 \times 10^{-6} & -10.08 \times 10^{-9} \\ 22.98 \times 10^{-8} & -10.08 \times 10^{-9} & 2.55 \times 10^{-12} \end{pmatrix}$$

These are the figures that were used in writing down the standard
errors of a, b, c, and d. Evaluated as a determinant, this gives
$A^{-1} = 4.6 \times 10^{-22}$.


Exercise 2. The evaluation of the determinant of the coefficients
is

$$A = (77.53 \times 10^2)(53.30 \times 10^4)(39.16 \times 10^{10}) = 0.162 \times 10^{22}$$

(See Exercise 1 of Sec. 61, p. 161.) This result is not exact; the
discrepancy arises from instability, and could be overcome by
carrying more decimals.

Exercise 3. (a) Prove that the standard error of the curve at
x = 1 is 0, and that at x = 0 it is the same as the standard error
of a.

(b) Why is the standard error of the y intercept practically equal
to the standard error of b? Argue geometrically and analytically.

73. A note on instability. As often happens in curve fitting,
these normal equations are unstable. One of the most sensitive
tests for instability is to compare the direct solution (already found
in the "1" column of Rows 11, 12, and 13) with that given by
using the reciprocal matrix as a multiplier; by such means we get
the reciprocal solution (pp. 165-166)

$$10B = (151.06 \times 236.75 - 654.02 \times 98.03 + 1233.8 \times 22.98)\,10^{-6} = 0.0237 \times 10^{-4}$$

and in like manner (which the student should undertake as an
exercise),

$$10^2 C = -0.0508 \times 10^{-4}, \qquad 10^5 D = 0.2500 \times 10^{-4}$$

These are in disagreement with the direct solution found in Rows
11, 12, and 13, and thus instability is indicated. The direct solution
satisfies the normal equations to the last decimal, but, when
there is instability, many other solutions not too far away could
do the same thing. 2 The reciprocal solution, however, does not
satisfy the normal equations, the actual numbers being 111 against
151.06, 483 against 654.02, 919 against 1233.79. As indicated in
Exercise 2, these discrepancies could be overcome by carrying
more decimals.
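The source of the trouble is visible in the arithmetic of the reciprocal solution itself: three large products nearly cancel, so that the last figures of the multipliers govern the result. The sketch below repeats the multiplication for B with the figures as printed, leaving the powers of 10 aside; the variable names are illustrative.

```python
# Right-hand members of the normal equations (the "1" column)
rhs = [151.06, 654.02, 1233.8]
# First row of the reciprocal matrix, as printed (powers of 10 omitted)
row_b = [236.75, -98.03, 22.98]

terms = [r * c for r, c in zip(rhs, row_b)]
total = sum(terms)                      # about 2.6, from terms of order 10^4
largest = max(abs(t) for t in terms)
cancellation = largest / abs(total)     # how many times larger the terms are

print(terms)
print(total, cancellation)
```

With the figures as printed the sum comes out near 2.6, whereas the text, working from more decimals, has 2.37; a change of one unit in the last place of a single multiplier moves the sum by several per cent.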

The insidious thing about instability is that its presence may go
undetected. For instance, if here we had only the "reciprocal
solution," and had not tried to check it by substitution, we might
have accepted it. The use of the reciprocal matrix as a multiplier
is in theory very fascinating, but as a practical matter in curve
fitting we should not wax too enthusiastic about it. Fortunately
it does work to good advantage in many problems, as seen for
instance in Chapter V of R. A. Fisher's Statistical Methods for
Research Workers. In Section 36 also, the equations were
stable and no difficulties arose.


Example 2. Another Polynomial 

BOTH x AND y OBSERVATIONS SUBJECT TO ERROR

74. The observations and their weights. The polynomial

$$y = a + bx + cx^2 \qquad (1)$$

is to be fitted to points in the xy plane, x and y both subject to
error. The observations on the coordinates are shown in the
accompanying table.

2 See footnotes 10 and 12 in Chapter IX (pp. 160 and 161) for references 
to Tuckerman’s paper and a note by the author on this subject. 



Table of observations (Example 2)

N denotes the number of observations at a point, s their standard deviation,
defined by

$$s^2 = \frac{1}{N} \sum \text{res}^2 \qquad (2)$$

with a similar equation for the standard deviation of the y observations.

  Point      For the x coordinates                For the y coordinates
  No. h      True value   Obs'd    N    s^2       True value   Obs'd    N    s^2
   1           -2         -2.28     5   0.154       0.11       0.129    10   0.00447
   2           -1         -1.13     6   0.152       0.15       0.131     9   0.00177
   3            0         -0.44     7   0.202       0.20       0.198     8   0.00264
   4            1          1.44     8   0.315       0.25       0.247     7   0.00182
   5            2          1.90     9   0.176       0.31       0.312     6   0.00105
   6            3          2.93    10   0.124       0.38       0.380     5   0.00100
   7            4          3.81     7   0.307       0.45       0.441     7   0.00294
   8            5          5.07     4   0.032       0.53       0.529    12   0.00286
   9            6          6.11    10   0.343       0.61       0.590     4   0.00015
  10            7          7.17     9   0.016       0.70       0.728     4   0.00082
  11            8          7.83     7   0.176       0.79       0.791     7   0.00244
  12            9          9.32     5   0.154       0.89       0.922     5   0.00442


75. A note on the observed values. This is an artificial example,
carried out under ideal conditions, in order to combine special
features of a number of practical examples that might have been
chosen for illustration. The coordinates observed (the "true"
points) were taken along the curve

$$y = 0.2 + 0.05x + 0.003x^2 \qquad (3)$$

The standard error of a single observation on an x coordinate was
assumed to be 0.5 cm., and the standard error of a single observation
on a y coordinate was assumed to be 0.05 lb. Artificial
observations were taken by using Tippett's numbers, assuming
that the observations are normally distributed, according to the
table shown in the appendix. Considerable departure from the
normal distribution would not affect the results appreciably. The
procedure can be described as follows:





i. Read out N numbers systematically (i.e., read up, or
down, or diagonally) from Part A of the appendix (p. 252).
Each number is a deviation, in units of $\sigma_x$ or $\sigma_y$, as the case
may be, from the true value of whatever coordinate is being
observed. If desired, Tippett's Random Sampling Numbers
(Tracts for Computers, No. 15, Cambridge 1927) can be used,
in conjunction with the table in Part B of the appendix. N is
the number of observations on a coordinate, as shown in the
accompanying Table of Observations.

ii. Take the average of these N deviations, and multiply it by
$\sigma_x$ if an x coordinate is being measured, or by $\sigma_y$ if a y coordinate
is being measured. (The numerical values of $\sigma_x$ and $\sigma_y$ are to
be discussed shortly. Assume for the moment that they have
been settled upon.)

iii. Add this deviation to the true coordinate to get the observed
coordinate, and enter it in the table.

iv. Compute the variance, or the square of the standard
deviation, for the observations on each coordinate. These are
shown as $s^2$ in the table. The formula is in the heading.

The question arises how to weight the various values of X and Y.
For one thing, the weight of any coordinate will be proportional
to N, but that is not enough; the precisions of single observations
are evidently not the same for the y coordinates as they are for
the x coordinates, judging from the $s^2$ columns. In order to check
the x and y precisions for this particular set of observations, as one
might wish to do in practice, we may plot Fig. 21 to show the
successive values of $s^2 N/(N - 1)$ for x, and of the same thing for y,
both plotted against x (y would do as well). $s^2 N/(N - 1)$ for x
(or y) at any point is an estimate of the square of the standard
error of the single observations on x (or y) at that point.

Although there is fluctuation of the estimates, there is not too
much, and there is no trend. 3 Now the weighted average on
the x plot is not far from 0.25, and the weighted average on the y
plot is not far from 0.0025, so it seems reasonable to conclude
that the prior values of precision (viz., $\sigma_x = 0.5$ cm., $\sigma_y = 0.05$ lb.)
should not be changed. In practice, standard errors are usually
known pretty definitely from previous experience. Accordingly
we take 0.5 for the standard error of single observations on x
and 0.05 for the standard error of single observations on y, over
the entire range. By recalling that weights are inversely proportional
to the variances (p. 21), we see that a single observation on
a y coordinate has 100 times the weight of a single observation
on an x coordinate (Eq. 16, p. 22). As a matter of convenience,
then, we take unity for the weight of a single observation on x,
and 100 for the weight of a single observation on y. This is
equivalent to setting $\sigma = 0.5$ for observations of unit weight.
The values of $w_x$ are then the same as the numbers N referring
to x, in the table of observations, and the values of $w_y$ are 100
times the numbers N referring to y. (If the precision of single
observations on either x or y coordinates were variable over the
range of the points, obvious modifications could be made in the
weighting.)

Fig. 21. Estimating the precisions of the observations. The chart shows
estimates of the squares of the standard errors of single observations. A chart
of this kind will disclose trends and abnormal variations in the precisions,
though one should have more points than this at his disposal.

3 In practice one must have enough estimates to enable him to plot a
Shewhart control chart, before making such statements. However, here we
have a method (the use of Tippett's numbers) that in the past has demonstrated
randomness, and these statements can safely be made.
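The estimates plotted in Fig. 21 and the resulting weights can be computed directly from the table of observations. In the sketch below the figures are those of the table; the choice of N - 1 as the weight in forming the average is an assumption, since the text does not say how its weighted average was formed, and the names are illustrative.

```python
# N and s^2 for the x and y coordinates, from the table of observations
NX = [5, 6, 7, 8, 9, 10, 7, 4, 10, 9, 7, 5]
S2X = [0.154, 0.152, 0.202, 0.315, 0.176, 0.124,
       0.307, 0.032, 0.343, 0.016, 0.176, 0.154]
NY = [10, 9, 8, 7, 6, 5, 7, 12, 4, 4, 7, 5]
S2Y = [0.00447, 0.00177, 0.00264, 0.00182, 0.00105, 0.00100,
       0.00294, 0.00286, 0.00015, 0.00082, 0.00244, 0.00442]


def estimates(ns, s2s):
    """s^2 N/(N - 1) at each point: estimated variance of a single observation."""
    return [s2 * n / (n - 1) for n, s2 in zip(ns, s2s)]


def pooled(ns, s2s):
    """Average of the estimates, each weighted by N - 1 (assumed weighting)."""
    return sum(s2 * n for n, s2 in zip(ns, s2s)) / sum(n - 1 for n in ns)


est_x = estimates(NX, S2X)
est_y = estimates(NY, S2Y)
var_x = pooled(NX, S2X)  # to be compared with 0.5^2 = 0.25
var_y = pooled(NY, S2Y)  # to be compared with 0.05^2 = 0.0025

# Weights adopted in the text: unity per observation on x, 100 per observation on y
w_x = [float(n) for n in NX]
w_y = [100.0 * n for n in NY]
print(var_x, var_y)
```

The two averages come out near 0.22 and 0.0028, which with so few degrees of freedom is not in conflict with the prior values 0.25 and 0.0025.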

76. Formation and solution of the normal equations. We shall
carry out the steps called for in Section 60.

1st step: get approximate values for a, b, c. By passing the curve
through three selected points, approximate values for $a_0$, $b_0$,
$c_0$ could be found (see the reduced type on the method of selected





222 


EXERCISES AND NOTES 


[Sec. 76] 


points, Sec. 55). In this particular case, however, we shall
instead use the true values of the parameters as approximations.
They are found in Eq. 3. Accordingly, we write

$$ a_0 = 0.2, \qquad b_0 = 0.05, \qquad c_0 = 0.003 \tag{4} $$

Of course, the final results will be the same, no matter what values
we choose (within reason) for the approximations $a_0$, $b_0$, $c_0$ (cf.
the reduced type on pp. 137 and 138).

2d step: the derivatives. For the function being fitted (Eq. 1)
we write

$$ F = y - (a + bx + cx^2) \tag{5} $$

and then find the following derivatives:

$$ F_x = -(b_0 + 2c_0 X), \quad F_y = 1, \qquad F_a = -1, \quad F_b = -X, \quad F_c = -X^2 \tag{6} $$

whence (see Eq. 8, p. 134),

$$ L \ \text{or} \ \frac{1}{W} = \frac{(b_0 + 2c_0 X)^2}{w_x} + \frac{1}{w_y} \tag{7} $$

Also we write

$$ F_0 = Y - (a_0 + b_0 X + c_0 X^2) \tag{8} $$
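As a check on the second step, the derivatives and $1/W$ at a single point may be evaluated as in the sketch below. The function name and dictionary keys are ours. The values X = 7.17, $w_x$ = 9 and $w_y$ = 400 are those that the later working implies for Point 10, and are taken here as assumptions.

```python
A0, B0, C0 = 0.2, 0.05, 0.003  # approximate parameters, Eq. 4


def step_quantities(X, w_x, w_y, Y=None):
    """Derivatives of F = y - (a + b x + c x^2) at the approximate
    parameters, and 1/W from Eq. 7. F_0 is computed only if Y is given."""
    F_x = -(B0 + 2.0 * C0 * X)
    F_y = 1.0
    inv_W = F_x ** 2 / w_x + F_y ** 2 / w_y
    F_0 = None if Y is None else Y - (A0 + B0 * X + C0 * X ** 2)
    return {"F_x": F_x, "F_y": F_y, "F_a": -1.0, "F_b": -X,
            "F_c": -X ** 2, "inv_W": inv_W, "F_0": F_0}


# Point 10 (values assumed from the later working in Sec. 78).
q = step_quantities(X=7.17, w_x=9, w_y=400)
print(q["F_x"], q["inv_W"])
```

The two printed numbers should agree with the entries $F_x = -0.09302$ and $1/W = 0.003461$ used for Point 10 in Section 78.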


3d step: numerical values; Table 1. We are now ready to
calculate the numerical values of $1/W$, $1/\sqrt{W}$, $F_a$, $F_b$, $F_c$, and
$F_0$, and to compute Table 1, which precedes the matrix, Table 2.

4th step: preparation of the matrix; Table 2. Now divide the
values of $F_a$, $F_b$, $F_c$, and $F_0$ in any row by the corresponding value
of $1/\sqrt{W}$, and form the sums at the right and at the bottom for
checks.

5th step: the formation of the normal equations. The normal 
equations are formed by the usual accumulation of squares and 
cross-products from Table 2. The solution is carried out by the 
routine outlined on page 158, and used previously on pages 82 
and 83, and in the preceding example. 



TABLE 1. The third step: numerical values of $1/W$, $1/\sqrt{W}$, $F_a$, $F_b$, $F_c$, and $F_0$ at the 12 points. (The entries of this table are not legible in this copy.)

... the way through, it would ordinarily not be listed in a column.








TABLE 2

The fourth step: the matrix for the formation of the normal equations

(This comes by divisions performed on Table 1.)

Point                                                   1
No.     -√W·F_a     -√W·F_b        -√W·F_c       √W·F_0           Sum
 1       28.13      -6.4136×10     1.4623×10²    +7.7104×10⁻¹     30.8891
 2       26.52      -2.9968        0.3386        -4.3307          19.5311
 3       25.24      -1.1106        0.0489        +4.9016          29.0799
 4       23.20       3.3408        0.4811        -7.2430          19.7789
 5       21.90       4.1610        0.7906        +1.3512          28.2028
 6       20.17       5.9098        1.7316        +1.5632          29.3746
 7       21.38       8.1458        3.1035        +1.4838          34.1131
 8       20.20      10.2515        5.2026        -0.4080          35.2461
 9       17.54      10.7169        6.5481        -4.8235          29.9815
10       17.00      12.1890        8.7395        +2.5959          40.5244
11       18.99      14.8692       11.6426        +2.9567          48.4585
12       15.35      14.3062       13.3334        -0.7046          42.2850
Sum     255.62      73.3692       53.4228         5.0530         387.4650 ✓
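The sum checks of Table 2, and the accumulation of squares and cross-products called for in the fifth step, might be sketched as follows. The figures are those printed in Table 2; we assume that the power of ten shown in the first row of a column holds all the way down that column. The function names are ours, and the routine solution of page 158 is not reproduced.

```python
# Printed figures of Table 2: the four matrix columns and the row sum.
TABLE2 = [
    (28.13, -6.4136, 1.4623, 7.7104, 30.8891),
    (26.52, -2.9968, 0.3386, -4.3307, 19.5311),
    (25.24, -1.1106, 0.0489, 4.9016, 29.0799),
    (23.20, 3.3408, 0.4811, -7.2430, 19.7789),
    (21.90, 4.1610, 0.7906, 1.3512, 28.2028),
    (20.17, 5.9098, 1.7316, 1.5632, 29.3746),
    (21.38, 8.1458, 3.1035, 1.4838, 34.1131),
    (20.20, 10.2515, 5.2026, -0.4080, 35.2461),
    (17.54, 10.7169, 6.5481, -4.8235, 29.9815),
    (17.00, 12.1890, 8.7395, 2.5959, 40.5244),
    (18.99, 14.8692, 11.6426, 2.9567, 48.4585),
    (15.35, 14.3062, 13.3334, -0.7046, 42.2850),
]
# Assumed powers of ten for the four columns (taken from the first row).
SCALE = (1.0, 10.0, 100.0, 0.1)


def check_row_sums(rows, tol=5e-4):
    """The printed figures in each row should add up to the Sum column."""
    return all(abs(sum(r[:4]) - r[4]) <= tol for r in rows)


def normal_equations(rows):
    """Accumulate squares and cross-products of the matrix columns.

    The first three columns hold -sqrt(W)*F_a, -sqrt(W)*F_b, -sqrt(W)*F_c,
    so each element of `rhs` is the negative of [W F F_0].
    """
    cols = [[r[k] * SCALE[k] for r in rows] for k in range(4)]
    u, f = cols[:3], cols[3]
    N = [[sum(p * q for p, q in zip(u[i], u[j])) for j in range(3)]
         for i in range(3)]
    rhs = [sum(p * q for p, q in zip(u[i], f)) for i in range(3)]
    return N, rhs


N, rhs = normal_equations(TABLE2)
print(check_row_sums(TABLE2), rhs)
```

Apart from powers of ten, the three numbers in `rhs` should reproduce the figures 147.18, -29.52, and 28.95 that appear in the reciprocal solution of Section 77.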


Cleared of minus signs and powers of 10, Rows 11, 12, and 13
lead to the following values of the parameter-residuals A, B, C,
and to the reciprocal matrix shown below.

$$ A = -0.00241, \qquad B = 0.004306, \qquad C = -0.000577 \tag{9} $$

$$ \begin{pmatrix} 0.0002767 & 0.0000003 & -0.0000057 \\ 0.0000003 & 0.0000638 & -0.0000084 \\ -0.0000057 & -0.0000084 & 0.0000014 \end{pmatrix} \tag{10} $$


The adjusted parameters will be found by subtracting each 
residual from the corresponding approximate value, according to 
Eqs. 6 in Chapter IV, page 52. The numerical results follow. 

$$ a = a_0 - A = 0.2 + 0.00241 = 0.20241 $$
$$ b = b_0 - B = 0.05 - 0.004306 = 0.045694 \tag{11} $$
$$ c = c_0 - C = 0.003 + 0.000577 = 0.003577 $$





Normal equations

(The tabulated normal equations and their solution are not legible in this copy.)

* The factor $10^{-1}$ holds all the way down the column.







The reciprocal matrix contains the variance and product variance
coefficients, whence we may write the standard errors of the
parameters a, b, c, and of any calculated y value or, in fact, of
any function of a, b, c. The diagonal shows that the

$$ (\text{S.E. of } a)^2 = 0.0002767\,\sigma^2, \qquad (\text{S.E. of } b)^2 = 0.0000638\,\sigma^2, \qquad (\text{S.E. of } c)^2 = 0.0000014\,\sigma^2 \tag{12} $$

With $\sigma = 0.5$ (the standard error of observations of unit weight),
it follows that the

$$ \text{S.E. of } a = 0.0083, \qquad \text{S.E. of } b = 0.0040, \qquad \text{S.E. of } c = 0.00059 \tag{13} $$

whence

$$ a = 0.2024 \pm 0.0083, \qquad b = 0.04569 \pm 0.0040, \qquad c = 0.00358 \pm 0.00059 \tag{14} $$
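The passage from Eqs. 12 to Eqs. 13 is a single square root for each parameter, as the following sketch shows. The names are ours; the coefficients are the diagonal of the reciprocal matrix as quoted in Eqs. 12.

```python
import math

SIGMA = 0.5  # standard error of observations of unit weight

# Diagonal of the reciprocal matrix (variance coefficients), Eqs. 12.
VARIANCE_COEFFICIENTS = {"a": 0.0002767, "b": 0.0000638, "c": 0.0000014}


def standard_errors(coefficients, sigma):
    """S.E. of each parameter = sigma * sqrt(variance coefficient)."""
    return {name: sigma * math.sqrt(c) for name, c in coefficients.items()}


se = standard_errors(VARIANCE_COEFFICIENTS, SIGMA)
for name, value in se.items():
    print(name, value)
```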

77. The reciprocal solution. The reciprocal solution for the
unknowns A, B, C is obtained as follows:

$$ -A = 147.18 \times 0.0277 - 29.52 \times 0.0003 - 28.95 \times 0.0572 = 0.0241 \times 10^{-1} \tag{15} $$

$$ -10B = 147.18 \times 0.0003 - 29.52 \times 0.6382 - 28.95 \times 0.8381 = -0.4306 \times 10^{-1} \tag{16} $$

$$ -100C = -147.18 \times 0.0571 + 29.52 \times 0.8381 + 28.95 \times 1.4275 = 0.5766 \times 10^{-1} \tag{17} $$

These values of A, B, and C substituted into the left-hand
members of the normal equations give numbers that are to be
compared with the right-hand members. The results are shown
below.

Row    Value of the left-    Value of the right-
       hand member           hand member
I          147.27                147.18
2          -29.60                -29.52
3           28.98                 28.95
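A sketch of the reciprocal solution in unscaled units is given below. It rests on an assumption of ours: that the printed figures of Eqs. 15 to 17 are the reciprocal-matrix elements and right-hand members cleared of powers of ten. The unscaled values entered here were chosen to agree with the matrix (10) and with sums formed from Table 2; they are not printed in this form in the text.

```python
# Reciprocal matrix, unscaled (assumed scaling of the figures in Eqs. 15-17).
RECIPROCAL = [
    [2.77e-4, 3.0e-7, -5.72e-6],
    [3.0e-7, 6.382e-5, -8.381e-6],
    [-5.72e-6, -8.381e-6, 1.4275e-6],
]
# Sums of products of the Table 2 columns with sqrt(W)*F_0, unscaled (assumed).
RHS = [14.718, -29.52, 289.5]
APPROXIMATE = [0.2, 0.05, 0.003]  # a_0, b_0, c_0 from Eq. 4


def parameter_residuals(reciprocal, rhs):
    """A, B, C: minus the reciprocal matrix applied to the right-hand
    members (the matrix columns of Table 2 carry minus signs)."""
    return [-sum(c * r for c, r in zip(row, rhs)) for row in reciprocal]


residuals = parameter_residuals(RECIPROCAL, RHS)
adjusted = [p0 - r for p0, r in zip(APPROXIMATE, residuals)]
print(residuals)
print(adjusted)
```

Under these assumptions the residuals agree with Eqs. 9, and the adjusted parameters with Eqs. 11, to about the number of figures carried.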





This close agreement, however, required more figures in Tables 
1 and 2 than were advised on pages 79 and 155. 

78. Adjusting the observations. The calculated point corresponding
to the observed coordinates X, Y can now be found.
We note that there will be a $\lambda$ at every point, given by Eqs. 10,
page 136.

$$ \lambda_i = W_i \left( F_0^i - F_a^i A - F_b^i B - F_c^i C \right) $$

The superscripts on F refer to the point numbers, as they did on
page 133. We use A = -0.00241, B = 0.00431, C = -0.00058,
with the values of $F_0$, $F_a$, $F_b$, and $F_c$ already entered in Table 1,
page 223. We shall adjust the observations only at Points 10, 11,
and 12, for illustration.

$$ \lambda_{10} = \frac{1}{0.003461} \{0.01527 - 0.00241 + 7.17 \times 0.00431 - 51.41 \times 0.00058\} = 4.029 \tag{18} $$

$$ \lambda_{11} = \frac{1}{0.002772} \{0.01557 - 0.00241 + 7.83 \times 0.00431 - 61.31 \times 0.00058\} = 4.094 \tag{19} $$

$$ \lambda_{12} = \frac{1}{0.004244} \{-0.00459 - 0.00241 + 9.32 \times 0.00431 - 86.86 \times 0.00058\} = -4.055 \tag{20} $$


whence, by applying Eqs. 12, page 138, the residuals can be computed
at once. The required values of $F_x$ are in Table 1.

At point 10,

$$ V_x = \frac{1}{w_x} \lambda_{10} F_x = \frac{1}{9} \times 4.029 \times (-0.09302) = -0.0416, \qquad V_y = \frac{1}{w_y} \lambda_{10} F_y = \frac{4.029}{400} = 0.01007 \tag{21} $$

At point 11,

$$ V_x = \frac{1}{7} \times 4.094 \times (-0.09698) = -0.0567, \qquad V_y = \frac{4.094}{700} = 0.0059 \tag{22} $$





At point 12, 





Fig. 22. An illustration of adjusted observations in curve fitting. The 
calculated curve and the 95 percent error band are shown in the neighborhood 
of points 10, 11, and 12. The error band is laid off above and below the cal¬ 
culated curve. The calculated or adjusted points lie on the calculated curve, 
except for “ errors of closure.” Compare with Figs. 16 and 17, pages 132 and 133. 





These residuals measured off from the observed points in the 
proper direction (see Fig. 22) give the calculated points , which 
lie on the calculated curve. Actually the points so calculated here 
do not fall exactly on the curve. Such discrepancies are trifling, 
being of second order from the neglect of second and higher powers 
of the residuals. One may simply manipulate the end decimal
of $V_x$ or $V_y$ or both, in order to place the calculated point

$$ x = X - V_x, \qquad y = Y - V_y \qquad \text{(Eqs. 13, p. 138)} $$

exactly on the curve. A precisely similar situation arises in
problems of surveying wherein, for exact satisfaction of the
geometrical conditions after adjustment, one often needs to manipulate
the end decimal of one or more angles and sides (cf. p. 84).

By adjusting the observations, as is now possible (Eqs. 12, p. 138),
the residuals can be inspected individually before any conclusion
is based on S, the summation of $w_x V_x^2 + w_y V_y^2$.
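The adjustment of a single point can be sketched as below for Points 10 and 11, using only figures quoted in this section. The function names are ours, and the weight $w_y$ = 400 at Point 10 is an assumption, inferred from the quoted residual $V_y$ = 0.01007.

```python
A, B, C = -0.00241, 0.00431, -0.00058  # parameter-residuals as used in the text

# Figures quoted for Points 10 and 11 (w_y at Point 10 is assumed).
POINTS = {
    10: {"X": 7.17, "F_0": 0.01527, "inv_W": 0.003461,
         "w_x": 9, "w_y": 400, "F_x": -0.09302},
    11: {"X": 7.83, "F_0": 0.01557, "inv_W": 0.002772,
         "w_x": 7, "w_y": 700, "F_x": -0.09698},
}


def multiplier(p):
    """lambda = W (F_0 - F_a A - F_b B - F_c C), with F_a = -1,
    F_b = -X, F_c = -X^2 for the parabola."""
    F_a, F_b, F_c = -1.0, -p["X"], -p["X"] ** 2
    return (p["F_0"] - F_a * A - F_b * B - F_c * C) / p["inv_W"]


def residuals(p):
    """V_x = lambda F_x / w_x and V_y = lambda F_y / w_y, with F_y = 1."""
    lam = multiplier(p)
    return lam * p["F_x"] / p["w_x"], lam * 1.0 / p["w_y"]


for number, point in POINTS.items():
    print(number, multiplier(point), residuals(point))
```

The calculated point then follows by subtracting each residual from the corresponding observed coordinate.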

Exercise. Compute the x and y residuals for the other nine 
points, and plot them. 

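79. The standard error of the calculated ordinates. In accordance
with Eq. 22 on page 167, the standard error of the function
$f(a, b, c)$ is

$$ \sigma_f^2 = \sigma^2 \left\{ 0.000\,28 \left(\frac{\partial f}{\partial a}\right)^2 + 0.000\,06 \left(\frac{\partial f}{\partial b}\right)^2 + 0.000\,0014 \left(\frac{\partial f}{\partial c}\right)^2 + 2 \times 0.000\,0003\, \frac{\partial f}{\partial a} \frac{\partial f}{\partial b} - 2 \times 0.000\,0057\, \frac{\partial f}{\partial a} \frac{\partial f}{\partial c} - 2 \times 0.000\,0084\, \frac{\partial f}{\partial b} \frac{\partial f}{\partial c} \right\} \tag{24} $$

In particular, the calculated y with its standard error for this
problem would be found by writing $a + bx + cx^2$ for $f(a, b, c)$, the
result being

$$ y = 0.2024 + 0.0457x + 0.0036x^2 \pm \sigma \left\{ 0.000\,28 + 0.000\,0006x + 0.000\,0526x^2 - 0.000\,0168x^3 + 0.000\,0014x^4 \right\}^{1/2} \tag{25} $$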





If the factor $\sigma$ in front of the brace be replaced by $1.96\sigma$, the
double sign gives the 95 percent error band. This band, laid off
from the true curve, is expected to embrace 95 percent of the calculated
curves that would be obtained at any abscissa x in a large
number of experiments like this one. Unfortunately, in practice,
we do not have the true curve, and can only lay off the error band
above and below the calculated curve, as is done in Fig. 22. The
band so drawn will vary from one experiment to another (p. 170).
Moreover, when $\sigma$ is not known, we can only lay off a "confidence
band" (p. 169), calculated from an estimated value of $\sigma$ (next
section). It is only when the number of degrees of freedom reaches
25 or 30 that the width of the confidence band can be interpreted
as an error band, and even then only in randomness.
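The standard error of a calculated ordinate, and the 95 percent band laid off about the calculated curve, might be evaluated as in this sketch. The function names are ours; the matrix is the reciprocal matrix of Eq. 10 written out in full decimals, and its rounded elements limit the figures that can be trusted.

```python
import math

SIGMA = 0.5
PARAMS = (0.20241, 0.045694, 0.003577)  # adjusted a, b, c
RECIPROCAL = [
    [0.0002767, 0.0000003, -0.0000057],
    [0.0000003, 0.0000638, -0.0000084],
    [-0.0000057, -0.0000084, 0.0000014],
]


def calculated_y(x):
    a, b, c = PARAMS
    return a + b * x + c * x * x


def se_calculated_y(x, sigma=SIGMA):
    """sigma_f for f = a + b x + c x^2; the derivatives with respect to
    a, b, c are 1, x, x^2."""
    g = (1.0, x, x * x)
    variance = sum(RECIPROCAL[i][j] * g[i] * g[j]
                   for i in range(3) for j in range(3))
    return sigma * math.sqrt(variance)


def error_band(x, factor=1.96):
    """Lower and upper limits laid off about the calculated curve."""
    half_width = factor * se_calculated_y(x)
    y = calculated_y(x)
    return y - half_width, y + half_width


print(se_calculated_y(0.0), error_band(8.0))
```

At x = 0 the standard error reduces to the S.E. of a, which affords a simple check.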

80. Calculation of the external estimate of $\sigma$. The external
estimate of $\sigma$ (Sec. 13) is the sum (S) of the weighted squares of
the residuals, divided by the number of degrees of freedom. Row
IV in the solution of the normal equations gives

$$ S = \sum (w_x V_x^2 + w_y V_y^2) = 1.68 \tag{26} $$

The number of degrees of freedom is 9, this being the number of
points (12) diminished by the number of parameters (3), whence

$$ \sigma^2(\text{ext}) = \frac{S}{9} = 0.19 \tag{27} $$

Now, in this example, we were furnished with a prior value of $\sigma$
(0.5; p. 221), and we are thus able to compute $\chi^2$, which we recall
is simply $S/\sigma^2$ (p. 15). We thus find

$$ \chi^2 = \frac{S}{\sigma^2} = \frac{1.68}{0.5^2} = 6.72 \tag{28} $$

For 9 degrees of freedom the average value of $\chi^2$ would be 9. The
value just obtained is less than the average. Fisher's tables show

$$ P(\chi^2) = 0.67 \ \text{(approx.)} $$

which interpreted means that, in randomness, in 67 out of 100
experiments $\chi^2$ would be greater than 6.72.
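In place of Fisher's tables, the probability may be computed directly. The sketch below uses a standard closed form for the chi-square upper tail with an odd number of degrees of freedom; that formula is brought in by us and is not taken from the text, and the function name is ours.

```python
import math


def chi2_upper_tail_odd(chi2, dof):
    """P(chi-square > chi2) for an odd number of degrees of freedom:
    2 Q(chi) + 2 Z(chi) * sum_{r=1}^{(dof-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    where Q and Z are the normal tail area and ordinate."""
    if dof < 1 or dof % 2 == 0:
        raise ValueError("this closed form holds for odd dof only")
    chi = math.sqrt(chi2)
    ordinate = math.exp(-chi2 / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= chi2 / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * ordinate * total


S, DOF, SIGMA_PRIOR = 1.68, 9, 0.5
sigma2_ext = S / DOF            # Eq. 27
chi2 = S / SIGMA_PRIOR ** 2     # Eq. 28
P = chi2_upper_tail_odd(chi2, DOF)
print(sigma2_ext, chi2, P)
```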





Example 3. A Formula Useful in Forestry 4 

81. The formula to be fitted. This example serves the purpose 
of illustrating three features: i. the fitting can be done with 
logarithms, the constant or nearly constant characteristic being 
suppressed or nearly suppressed to cut down the number of 
figures required; ii. W is constant throughout; iii. the prior
value of $\sigma$ can be expressed in terms of some of the parameters,
so that, finally, the minimized S can be transformed into $\chi^2$,
and the fit of the formula judged on this criterion. All three
features owe their existence both to the form of the fitted function
and to the experimental material and procedure. One or more
of them, however, is likely to be encountered in other work. If

x = the volume of a tree in board feet
y = the merchantable height of the tree
z = its diameter at breast height

then experience has shown that the equation

$$ x = a y^b z^c \tag{29} $$

predicts satisfactorily 5 the values of x from observations on y
and z.

The particular set of data for consideration in the present problem
consists of 66 points: measurements on the volume, merchantable
height, and diameter, of 66 trees. It will not be necessary
to display the full set of points for the discussion intended
here; the first six and the last will be sufficient. They come in
no particular order of size. The logarithms are written in the 
three right-hand columns, for convenient inspection, since they 
will be needed in the fitting. 

82. Rewriting the function to gain an advantage. Looking at 
the logarithms in the table of observations, we perceive that the 

4 This problem was furnished by Mr. Jesse H. Buell of the Forest Service, 
Asheville. 

5 Francis X. Schumacher and F. dos S. Hall, J. Agric. Res., vol. 47, 1933:
pp. 719-734; also Donald Bruce and Francis X. Schumacher, Forest Mensuration
(McGraw-Hill, 1935), Art. 140.





first part of the figures, the “characteristics,” do not vary much; 
in fact, in the Y' and Z' columns the characteristic is unity all 
the way down. What we need to do is to write the formula so 
that the variable part of the logarithms is brought into prominence. 
This can be accomplished by writing the formula as

$$ x' = a' + by' + cz' \qquad (x' = \log x, \ \text{etc.}) \tag{30} $$


Data for Example 3

                  Observations                         Logarithms
Point    Volume X       Height Y    Diam. Z      X' = log X   Y' = log Y   Z' = log Z
No.      (board feet)   (feet)      (inches)
  1           60           25         13.8         1.778        1.398        1.140
  2           60           24         14.0         1.778        1.380        1.146
  3          120           29         18.1         2.079        1.462        1.258
  4          270           38         21.0         2.431        1.580        1.322
  5          320           37         21.6         2.505        1.568        1.334
  6          130           30         16.5         2.114        1.477        1.218
  .            .            .           .            .            .            .
 66          320           54         18.8         2.505        1.732        1.274

Sums                                             152.136      102.451       84.090


and thereupon lowering the characteristics of x', y', and z' by the
harmless device of subtracting and adding unity to each logarithm,
arriving finally at the form

$$ x'' = a'' + by'' + cz'' \tag{31} $$

where the double primes denote suppressed logarithms, namely,

$$ x'' = x' - 1 = \log x - 1 $$
$$ y'' = y' - 1 = \log y - 1 $$
$$ z'' = z' - 1 = \log z - 1 \tag{32} $$
$$ a'' = a' - 1 + b + c = \log a - 1 + b + c $$
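The harmlessness of the device can be verified numerically, as in this sketch. The function names are ours; the three trees are the first three rows of the table of data, and the values of a', b, and c are those quoted in Section 84, used here only to exercise the identity.

```python
import math

# First three points of the table of data: (volume, height, diameter).
TREES = [(60, 25, 13.8), (60, 24, 14.0), (120, 29, 18.1)]


def logs(x, y, z):
    """Common logarithms x', y', z', rounded to three decimals as in the table."""
    return tuple(round(math.log10(v), 3) for v in (x, y, z))


def suppressed_logs(x, y, z):
    """Double-primed quantities: each logarithm lowered by unity (Eqs. 32)."""
    return tuple(v - 1.0 for v in logs(x, y, z))


def a_double_prime(a_prime, b, c):
    """a'' = a' - 1 + b + c, from Eqs. 32."""
    return a_prime - 1.0 + b + c


A_PRIME, B, C = -1.78222, 0.87476, 2.14226  # values quoted in Sec. 84

for tree in TREES:
    x1, y1, z1 = logs(*tree)
    x2, y2, z2 = suppressed_logs(*tree)
    # The two forms of the condition give the same discrepancy at every point.
    f_single = x1 - (A_PRIME + B * y1 + C * z1)
    f_double = x2 - (a_double_prime(A_PRIME, B, C) + B * y2 + C * z2)
    print(tree, f_single, f_double)
```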





83. Formation and solution of the normal equations. By transposing
the formula all to one side, we have the acceptable form

$$ f = x'' - (a'' + by'' + cz'') \tag{33} $$

The derivatives of f are as follows:

$$ f_{x'} = 1, \quad f_{y'} = -b, \quad f_{z'} = -c, \qquad f_{a''} = -1, \quad f_b = -y'', \quad f_c = -z'' \tag{34} $$

Then

$$ L = \frac{1}{W} = \frac{1}{w_{x'}} + \frac{b^2}{w_{y'}} + \frac{c^2}{w_{z'}} \qquad \text{(See Eq. 8, p. 134.)} $$

$$ = 0.434^2 \left\{ \frac{1}{x^2 w_x} + \frac{b^2}{y^2 w_y} + \frac{c^2}{z^2 w_z} \right\} \tag{35} $$

the last step coming from Exercise 8e on page 45.

In this investigation, and in related experience, the standard
errors of x, y, and z have been found proportional to the quantities
measured. To be specific,

The S.E. of x is 7 percent of x
The S.E. of y is 6 percent of y
The S.E. of z is 5 percent of z

It follows, then, from Eq. 13 on page 21 that

$$ w_x = \frac{\sigma^2}{(0.07x)^2}, \qquad w_y = \frac{\sigma^2}{(0.06y)^2}, \qquad w_z = \frac{\sigma^2}{(0.05z)^2} \tag{36} $$

wherefore

$$ \frac{1}{W} = \left(\frac{0.434}{\sigma}\right)^2 \left\{ (0.07)^2 + (0.06b)^2 + (0.05c)^2 \right\} \tag{37} $$

which is constant throughout, independent of x, y, and z. This
is the second of the two important features described above.





Now $\sigma$ is open to arbitrary choice, since weights are not absolute
but are relative only (p. 22); and a convenient choice is
to put

$$ \sigma^2 = 0.434^2 \{0.07^2 + (0.06b)^2 + (0.05c)^2\} \tag{38} $$

whereupon W becomes unity at all points. The value of $\sigma$, in this
problem, is not needed until at the end, when it will be compared
with $\sigma$(ext). (See Sec. 14, p. 29.) What is more important
at present, b and c will not be needed for the calculation
of W, in spite of the fact that x, y, and z are all subject to error.
From a computational standpoint, this is a fortunate situation, 
resting on the peculiar combination of the form of the fitted func¬ 
tion and the standard errors of x, y, and z. 
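That W is the same at every point, whatever the size of the tree, can be confirmed by a short calculation. In the sketch below the function names are ours; the weights are taken inversely proportional to the squares of the percentage standard errors stated above, and the values b = 0.875, c = 2.142 are borrowed from the numerical work of Section 84 purely for illustration.

```python
LOG10_E = 0.434                              # modulus of common logarithms
REL_SE = {"x": 0.07, "y": 0.06, "z": 0.05}   # S.E. as a fraction of the quantity


def prior_sigma2(b, c):
    """The convenient choice of sigma^2 (Eq. 38), which makes W unity."""
    return LOG10_E ** 2 * (REL_SE["x"] ** 2
                           + (REL_SE["y"] * b) ** 2
                           + (REL_SE["z"] * c) ** 2)


def inv_W(x, y, z, b, c, sigma2):
    """1/W at one point from Eq. 35, each weight being sigma^2 divided by
    the square of the standard error of that measurement."""
    w_x = sigma2 / (REL_SE["x"] * x) ** 2
    w_y = sigma2 / (REL_SE["y"] * y) ** 2
    w_z = sigma2 / (REL_SE["z"] * z) ** 2
    return LOG10_E ** 2 * (1.0 / (x * x * w_x)
                           + b * b / (y * y * w_y)
                           + c * c / (z * z * w_z))


b, c = 0.875, 2.142
sigma2 = prior_sigma2(b, c)
print(sigma2, inv_W(60, 25, 13.8, b, c, sigma2), inv_W(320, 54, 18.8, b, c, sigma2))
```

The value of `sigma2` so computed agrees to two significant figures with the prior value quoted in Eq. 43.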

As it happens, W being constant (unity) throughout, the same
results for a, b, and c would come from normal equations set up
under the (incorrect) assumption that only the measurements
on volume are subject to error, and that they are of unit weight.
But estimates of the parameters, however important, are not the
whole problem; one ought also to consider the adjustment of the
observations for a study of the trends (if any) in the residuals;
one ought also to know the minimized S for considerations of
the fit of the formula, as, for example, by comparing $\sigma$(ext) with
the prior $\sigma$, which fortunately is at hand in this example as it
was also in the preceding one. If the errors in the diameter and
merchantable height are masked, none of the residuals in volume, 
merchantable height, or diameter can be found; moreover, the 
entry in Row IV of the solution, which should be S , is instead an 
unknown multiple of it, wherefore the possibility of reconciling 
the known experimental conditions with the fit of the curve is 
lost or put on a basis that is likely to do more harm than good. 

Approximate values of a, b, and c (hence also of a''), after
being found by some method or other (see pp. 137 and 138), or
being known from previous experience, would be used in calculating
the value of

$$ f_0 = X'' - (a_0'' + b_0 Y'' + c_0 Z'') $$

at each of the 66 points. The capitals refer to the observed
values of log x - 1, log y - 1, log z - 1.





Since W = 1 throughout, Tables 1 and 2 of Section 60 coalesce,
and the normal equations are symbolized as follows. A'', B,
and C are the parameter-residuals.

                     Unknowns
Row    A''    B           C           = 1            Sum
I      66     [Y'']       [Z'']       -[f_0]
2             [Y''Y'']    [Y''Z'']    -[Y''f_0]
3                         [Z''Z'']    -[Z''f_0]
4                                     [f_0 f_0]

Since most of the adjustment is already contained in the approximate
values of a, b, and c, a maximum of two figures would suffice
in any column of X'', Y'', Z'', or $f_0$; and a maximum of three
figures would likely suffice in the normal equations. Such simplification
is our compensation for the trouble of computing $f_0$ at
each point.

The solution would proceed as on page 158. Row IV will contain
the minimized S, correctly distributed among the residuals
in volume, height, and diameter. The reciprocal matrix found
in Rows 11, 12, and 13 will contain the variance and product
variance coefficients of a'', b, and c.

As an exercise, the reader might express the variance coefficients
of a' and a in terms of $c_{11}$, $c_{12}$, etc., found in the reciprocal matrix
in the solution of the normal equations for A'', B, and C.

84. Numerical results. Instead of using approximate values of
a, b, and c, and computing an $f_0$ at each point, Mr. Buell had already
adopted the somewhat longer process of using $a_0' = b_0 = c_0 = 0$,
$f_0 = X'$, and solving for a', b, and c directly. His normal equations
are symbolized as follows, directly in terms of the logarithms
(Y' = log Y obs., etc.).

                     Unknowns
Row    a'     b          c          = 1         Sum
I      66     [Y']       [Z']       [X']
2             [Y'Y']     [Y'Z']     [Y'X']
3                        [Z'Z']     [Z'X']
4                                   [X'X']




Numerically, his equations were these:

                     Unknowns
Row    a'     b             c             = 1
I      66     102.451000    84.090000     152.136000
2             159.921325    131.022337    237.985322
3                           107.853544    195.795651
4                                         356.809522

The solution was found to be

a' = -1.78222 = log a,  a = 0.01652
b = 0.87476
c = 2.14226

Hence the relation found was

$$ x = 0.0165\, y^{0.875} z^{2.14} \tag{40} $$

Not having at hand the complete form of solution, and in particular,
not having S as it would appear in the form of solution
shown in Section 61, page 158, we shall here make use of Exercise 3
of Section 61 (see also Exercise 25 of Sec. 70), thus getting

$$ S = 356.809522 + 152.136000 \times 1.78222 - 237.985322 \times 0.87476 - 195.795651 \times 2.14226 = 0.324 \tag{41} $$
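How delicately S depends on the figures carried in the parameters can be seen from the following sketch, which evaluates Eq. 41 twice: once with the parameters as quoted, and once with them rounded to three decimals. The function name is ours, and the rounded set is introduced only for comparison.

```python
# Sums from Mr. Buell's normal equations.
SUMS = {"XX": 356.809522, "X": 152.136000, "YX": 237.985322, "ZX": 195.795651}


def minimized_S(a_prime, b, c):
    """S = [X'X'] - a'[X'] - b[Y'X'] - c[Z'X'], the form used in Eq. 41."""
    return (SUMS["XX"] - a_prime * SUMS["X"]
            - b * SUMS["YX"] - c * SUMS["ZX"])


S_full = minimized_S(-1.78222, 0.87476, 2.14226)   # five decimals, as quoted
S_short = minimized_S(-1.782, 0.875, 2.142)        # three decimals only
print(S_full, S_short)
```

With three decimals the second figure of S is already lost, the result being a small difference of numbers several hundred times as large.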

It will be noted that S is here the small remainder left over from
the addition and subtraction of relatively much larger numbers.
To secure two figures in S, one must carry a, b, and c through the
fourth decimal; this is so in spite of the fact that we can not possibly
rely statistically on so many figures in a, b, and c, a fact that
would be evident from their standard errors or from forest measurements
in general. This situation is to be contrasted with the
relatively few figures that would be required for the normal equations
if good approximate values $a_0''$, $b_0$, and $c_0$ had been used for
the calculation of $f_0$ at every point; with good approximations, the
sum $[f_0 f_0]$ would itself be close to the minimized S, so that the
correction terms need not be carried far. The reader will realize






that this matter has been stressed earlier (see, e.g., pp. 153,175, 
180, 182, and 209). 

We can now make the external estimate of $\sigma$ from the value of S
computed above, using Eq. 21, page 28, with the result that

$$ \sigma^2(\text{ext}) = \frac{S}{66 - 3} = \frac{0.324}{63} = 0.00514 \tag{42} $$

This is to be compared with the prior $\sigma^2$, which from the choice
made in terms of b and c on page 234 turns out to be

$$ \sigma^2 = 0.434^2 \{0.07^2 + (0.06 \cdot 0.875)^2 + (0.05 \cdot 2.142)^2\} = 0.00356 \tag{43} $$

Thus $\sigma^2$(ext) is about 50 percent larger than the prior $\sigma^2$. The
possibility of this comparison is the third feature mentioned at the
start.

A more exact comparison of the two estimates of $\sigma$ can be made
as follows. First of all, we need $\chi^2$. By the definition of $\chi^2$ on
page 15,

$$ \chi^2 = \frac{S}{\sigma^2} = \frac{0.324}{0.00356} \tag{44} $$

Since tables of chi-square do not run so high as 63 degrees of freedom,
we use Fisher's function 6

$$ \sqrt{2\chi^2} - \sqrt{2k - 1} $$

which works out to be 2.3, giving P a little over 0.01. This is a
little low, signifying that it might be well to look carefully at the
data for inhomogeneities of various kinds.
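Fisher's function and the corresponding probability can be evaluated as in the sketch below, taking k as the number of degrees of freedom (63) and treating the function as a normal deviate with unit standard deviation. The function names are ours.

```python
import math


def fisher_normal_deviate(chi2, k):
    """sqrt(2 chi^2) - sqrt(2k - 1), nearly a unit normal deviate for large k."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * k - 1.0)


def upper_tail_normal(z):
    """Probability that a unit normal deviate exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


chi2 = 0.324 / 0.00356        # Eq. 44
z = fisher_normal_deviate(chi2, 63)
P = upper_tail_normal(z)
print(chi2, z, P)
```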

It would be interesting to make a study of the residuals as functions
of x, or y, or z, but we shall not stop here except to indicate
how the residuals would be computed. From Eqs. 10, page 136,
we have

$$ \lambda = X'' - (a'' + bY'' + cZ'') \qquad \text{(at any point)} $$

6 This remarkable function is written at the bottom of Table III in Fisher's 
Statistical Methods for Research Workers (Oliver and Boyd), all editions. 
When k is large, say above 30, it is distributed very nearly as a normal deviate 
with unit standard deviation. 





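whence by Eqs. 12 on page 54 the logarithmic residuals would be

$$ V_{x'} = \frac{1}{w_{x'}} \lambda f_{x'} = \frac{\lambda}{w_{x'}} = \left(\frac{0.434 \cdot 0.07}{\sigma}\right)^2 \lambda $$

$$ V_{y'} = \frac{1}{w_{y'}} \lambda f_{y'} = -\frac{\lambda b}{w_{y'}} = -\left(\frac{0.434 \cdot 0.06}{\sigma}\right)^2 \lambda b \tag{45} $$

$$ V_{z'} = \frac{1}{w_{z'}} \lambda f_{z'} = -\frac{\lambda c}{w_{z'}} = -\left(\frac{0.434 \cdot 0.05}{\sigma}\right)^2 \lambda c $$

since $\partial f / \partial x' = \partial f / \partial x''$, etc.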

Certain special features peculiar to this problem have been 
mentioned, and the remaining details will be omitted; the reader, 
however, will profit from Professor Schumacher’s comments on 
the foregoing. 


85. Comments from Professor Francis X. Schumacher,
Duke University

(a) The number of figures required in the solution of Mr. Buell's
normal equations could be cut down by the calculation of an $f_0$ at
every point, as emphasized in Section 84, but perhaps a more
effectual saving of labor would follow upon transferring the
origins of coordinates from the natural zeros to the logarithmic
means $\bar{X}'$, $\bar{Y}'$, $\bar{Z}'$. We know from the first normal equation of
either of the sets on page 235 that the fitted plane will pass
through the logarithmic means, which is to say that the final values
of a, b, and c will satisfy

$$ \bar{X}' = a' + b\bar{Y}' + c\bar{Z}' $$

The transfer of the origins will not only cut down on the number
of figures required, but will also eliminate the parameter a' and
reduce the number of normal equations by one, leaving only b and
c as the unknowns, a' to be found afterward by noting that

$$ a' = \bar{X}' - b\bar{Y}' - c\bar{Z}' $$

The new sums and cross-products (to be denoted by appending
the sign ° to the brackets) would be found by making the
following reductions from Mr. Buell's equations:

[Y'Y']° = 159.921325 - 102.451²/66 = 0.887880
[Y'Z']° = 131.022337 - 102.451 × 84.090/66 = 0.490450
[Z'Z']° = 107.853544 - 84.090²/66 = 0.715240
[Y'X']° = 237.985322 - 102.451 × 152.136/66 = 1.826453
[Z'X']° = 195.795651 - 84.090 × 152.136/66 = 1.960557
[X'X']° = 356.809522 - 152.136²/66 = 6.122213

Four decimals will suffice, whereupon Mr. Buell's normal equations
(p. 236) reduce to the following set, which can be solved as
shown.


Row    b           c          = 1        Sum
I      0.8879      0.4904     1.8265     3.2048
2                  0.7152     1.9606     3.1662
3      Factors                6.1222     9.9093
4      -0.55231    -0.2708    -1.0088    -1.7700
II                 0.4444     0.9518     1.3962 ✓
5      -2.05710               -3.7573    -6.5926
6      -2.14176               -2.0385    -2.9903
III                S          0.3264     0.3264 ✓
8                  b =        0.8742
7                  c =        2.1418     3.1418 ✓


The values of b and c just obtained agree well enough with those 
on page 236, but with fewer figures and less trouble; and the same 
can be said for the sum of squares 0.3264 seen in Row III. Other¬ 
wise obtained, 

S = 6.1222 - 0.8742 X 1.8265 - 2.1418 X 1.9606 
= 0.3263 

affording an interesting check. (The two figures 0.3264 and 0.3263 
for S show a numerical comparison of the two expressions in parts 
c and a respectively of Exercise 3 on pp. 163 and 164.) 
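The transfer of origins to the logarithmic means can be sketched as follows, starting from the sums in Mr. Buell's equations. The function names are ours, and the two reduced equations are solved here by determinants rather than by the tabular routine shown above, which is a substitution of method on our part. Full figures are carried, so the results differ slightly from the four-decimal solution.

```python
N_POINTS = 66
SUM = {"X": 152.136, "Y": 102.451, "Z": 84.090}
RAW = {
    ("Y", "Y"): 159.921325, ("Y", "Z"): 131.022337, ("Z", "Z"): 107.853544,
    ("Y", "X"): 237.985322, ("Z", "X"): 195.795651, ("X", "X"): 356.809522,
}


def reduced(p, q):
    """Sum of products about the means: [PQ] - [P][Q]/n."""
    return RAW[(p, q)] - SUM[p] * SUM[q] / N_POINTS


def fit_about_means():
    """Solve the two reduced normal equations for b and c by determinants,
    then recover a' from the means and S from the reduced sums."""
    yy, yz, zz = reduced("Y", "Y"), reduced("Y", "Z"), reduced("Z", "Z")
    yx, zx, xx = reduced("Y", "X"), reduced("Z", "X"), reduced("X", "X")
    det = yy * zz - yz * yz
    b = (yx * zz - yz * zx) / det
    c = (yy * zx - yz * yx) / det
    a_prime = (SUM["X"] - b * SUM["Y"] - c * SUM["Z"]) / N_POINTS
    S = xx - b * yx - c * zx
    return a_prime, b, c, S


a_prime, b, c, S = fit_about_means()
print(a_prime, b, c, S)
```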





(b) The following suggestion is offered here in the hope of 
fostering first approximations as a preliminary to the real work of 
fitting by least squares. If the merchantable portion of the tree 
stem were of the same geometrical form in all tree sizes, the volume 
would vary directly with the height and as the square of the 
diameter. Hence useful approximations should be

b = 1

c = 2

$a' = \bar{X}' - \bar{Y}' - 2\bar{Z}'$ (not needed in the plan just outlined)

The problem is then seen as that of finding the effect of changes 
produced by the form of the merchantable solid upon tree volume. 

Example 4. A Sample Survey of Canned Goods 

86. Object of the survey. This example is described here, 
because the solution has a wide diversity of application in sample 
surveys; in fact, the solution given here has already been found 
useful in other fields. Of course, each new problem carried with 
it a multitude of theoretical and administrative details that are 
new and different, and these must be worked out and tailored to 
the new requirements. 

In laying plans for allotments of canned goods for the year 1943, 
the question of current inventories of distributors arose and was 
referred to the Census, with the thought that sampling might be 
introduced to decrease the number of inquiries involved, and the 
expense attached thereto, and — what is more important often¬ 
times — to decrease the time interval between the collection of 
the data and the completion of the tables. A solution in the form 
of a sample was provided by Messrs. Morris H. Hansen and 
William N. Hurwitz, and was tried out in the Bureau of the 
Census. The country was divided into 24 areas; and within any 
one area, the establishments were divided into classes of five 
different sizes, depending on their inventories of canned goods on 
Date I. An inventory was taken of the stock in every store on 
Date I, but inventories of only a sample of stores were taken on 






Fig. 23. Inventories of canned peas on two dates for selected sample stores. 
Each point represents a store. The four lines are drawn to show the calcu¬ 
lated relations for the four different classes. The first class is missing, be¬ 
cause in this area no store of the first class had canned peas. 






Date II, a month later. The sampling scheme diminishes the 
amount of reporting at Date II by requiring reports from all the 
stores in the 5th size class (the highest), but from only half of 



Fig. 24. Inventories for the fifth class, and the line through the centroid. 
Each point represents a store. The dashed wedge shows two standard devia¬ 
tions of the slope, calculated from Eq. 57. 

them in the 4th size class, a quarter of them in the 3d, an eighth of 
them in the 2d, and a sixteenth of them in the 1st or lowest class. 7 

From the sample, it was possible to make a usable estimate, for 
each area, of the stock that would have been recorded by taking an 

7 In order to produce reliable estimates of inventories on Date II, by area, 
the sampling ratios were changed for smaller areas, depending on the sizes of 
the stocks on hand. The figures just given constitute a typical set of sampling 
ratios for one of the largest areas. The size class in which a store belongs is 
determined by the number of cans of all kinds of goods on hand — peas, 
beans, soup, meat, etc., — but the analysis is carried out for each commodity 
separately. This explains why the size classes for peas overlap in Figs. 23 
and 24: a store with a large over-all inventory may be small in peas, and 
vice versa. 






inventory of all the stores on Date II. Moreover, the design of 
the sample was such that the reliability of these estimates could 
be made on the basis of some initial trial samples, and sharpened 
by further trials. Some of the underlying theory can be ap¬ 
proached in terms of curve fitting, and will be presented as such 
here. A forthcoming publication by Messrs. Hansen and Hurwitz 
will contain many other interesting aspects of the problem, par¬ 
ticularly from the sampling angle. 

87. What the sample gives. Let the subscript i denote the size
class. There being five size classes in any area, i will run through
the values 1, 2, 3, 4, 5. Let the subscript j refer to a particular
store in the ith class. Then j will run through the values
1, 2, ..., $n_i$.

$n_i$ is the number of establishments sampled in the ith class

$N_i$ is the total number of establishments in that class

$x_{ij}$ is the inventory (number of cases of peas 8 on hand) in the jth establishment of the ith class on Date I (known for every store)

$z_{ij}$ is the inventory of this same store on Date II (known only for the stores that are in the sample)

$X_i = \sum_{j=1}^{N_i} x_{ij}$ = the complete inventory of all stores in the ith class on Date I. (In this summation, j runs from 1 to $N_i$, to include all the stores in the class. $X_i$ is known.)

$x_i = \sum_{j=1}^{n_i} x_{ij}$ = the inventory of just the sample stores in the ith stratum, on Date I. (In this summation, j runs from 1 only to $n_i$, since only the sample retailers are admitted in this sum. $x_i$ is known.)

8 For convenience, the analysis will be carried through with reference to 
canned peas, though obviously any other commodity or commodity group 
could be substituted. 





$Z_i = \sum_{j=1}^{N_i} z_{ij}$, similar to $X_i$ except that this is for Date II. ($Z_i$ is to be estimated.)

$z_i = \sum_{j=1}^{n_i} z_{ij}$, similar to $x_i$, except that this is for Date II. ($z_i$ is known.)

$Z = \sum_{i=1}^{5} Z_i$ = the established inventory on Date II for the sum of all the five classes in this area.

88. The estimated inventory and its standard error. One way
to estimate the inventory for class i on Date II is to say that it is
proportional to the inventory of that class on Date I, in accordance
with which we write

$$ Z_i = b_i X_i \tag{46} $$

Then

$$ Z = \sum_{i=1}^{5} Z_i = Z_1 + Z_2 + Z_3 + Z_4 + Z_5 \tag{47} $$

is the estimated total inventory of all five classes in the area on
Date II. Curve fitting enters the problem in the determination of
usable values of $b_i$ from the sample of stores in each class, and in
the calculation of the variances of the estimated inventories.

The assumption will be that, except for accidental influences, 
such as weather, delayed shipping schedules, and mistakes in 
counting, all the stores of size class i would increase or decrease 
about the same relative amount between the two dates. This 
assumption, in this problem, has been found to lead to useful 
results. Of course, outside this particular field, or under other 
conditions, the same assumptions might lead to difficulty. It 
is only by careful investigation that one is able to say in advance 
under what conditions his assumptions will lead to usable predic¬ 
tions. Of course, the assumption that the inventories are each 
about the same on the two dates will be found violated by many 
individual stores, but on the whole it will be close enough for the 
purpose intended. 

In evaluating the error in the slope $b_i$, we recognize the existence
of accidental influence of variation in both $x_{ij}$ and $z_{ij}$. The bigger





the inventory, the bigger the standard error of the accidental
variations. Hence we shall put

$$\sigma_{x_{ij}}^2 = x_{ij}\,\sigma^2, \qquad w_{x_{ij}} = \frac{1}{x_{ij}}$$

wherein $\sigma$, as usual, is the standard error of observations of unit
weight. We might write similar equations for the weight and
standard error of $z_{ij}$ (inventory of a sample store on Date II), but,
if the two inventories $x_{ij}$ and $z_{ij}$ are not greatly different, it
will be sufficient to make the x and z weights equal, thereby writing

$$w_{z_{ij}} = w_{x_{ij}} = \frac{1}{x_{ij}}$$


The standard error resulting from the accidental influences on 
Xi on Date I can be found as follows: 

$$X_i = \sum_{j=1}^{N_i} x_{ij} \tag{52}$$

Hence by the result obtained in Exercise 2 on page 42,

$$\sigma_{X_i}^2 = \sigma_{x_{i1}}^2 + \sigma_{x_{i2}}^2 + \cdots + \sigma_{x_{iN_i}}^2
  = (x_{i1} + x_{i2} + \cdots + x_{iN_i})\,\sigma^2 = \sigma^2 X_i \tag{53}$$

We here take

$$L = \frac{b_i^2}{w_x} + \frac{1}{w_z} = \frac{1 + b_i^2}{w_{ij}}
  \qquad \text{(at point } ij\text{; Eq. 8, p. 134)}$$

$w_{ij}$ is here written for the weight of either $x_{ij}$ or $z_{ij}$.





The normal equation for $b_i$ is shown below. The subscript ij
on w, x, and z is omitted for convenience, and the sums ($\sum$) are
taken over the sample stores (i.e., j runs from 1 to $n_i$, while i
remains constant).

$$\begin{array}{c|c|c}
b_i & = 1 & C \\ \hline
\dfrac{\sum wx^2}{1 + b_i^2} & \dfrac{\sum wxz}{1 + b_i^2} & 1
\end{array}$$


The resulting solution for the slope is

$$b_i = \frac{\sum wxz}{\sum wx^2} \qquad \text{(Compare with Eq. 34 on p. 31.)}$$

$$\phantom{b_i} = \frac{\sum z_{ij}}{\sum x_{ij}} = \frac{z_i}{x_i} \tag{56}$$


since $w_{ij} x_{ij}$ is to be counted equal to unity (Eq. 50) and $x_i$ is
the inventory of the $n_i$ sample stores on Date I. Note that by
the value of $b_i$ just obtained, the line is to be drawn from the
origin to the centroid of the $n_i$ points. The variance of $b_i$ is seen
from the normal equation to be

$$\sigma_{b_i}^2 = \frac{1 + b_i^2}{\sum wx^2}\,\sigma^2
  = \frac{1 + b_i^2}{x_i}\,\sigma^2 \tag{57}$$
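As a numerical check on Eq. 56, the sketch below fits the line
through the origin with weights 1/x and confirms that the weighted
least-squares slope equals the ratio of the sample totals. The store
inventories, the complete Date I inventory, and the function name
`weighted_slope` are assumptions made for illustration only.

```python
def weighted_slope(x, z):
    """Slope b = sum(w*x*z) / sum(w*x^2) with weights w = 1/x (Eq. 56)."""
    w = [1.0 / xj for xj in x]
    num = sum(wj * xj * zj for wj, xj, zj in zip(w, x, z))
    den = sum(wj * xj * xj for wj, xj in zip(w, x))
    return num / den

# Hypothetical Date I and Date II inventories (cases) for one size class.
x_sample = [12.0, 30.0, 45.0, 8.0, 25.0]
z_sample = [10.0, 33.0, 40.0, 9.0, 26.0]

b = weighted_slope(x_sample, z_sample)
ratio_of_totals = sum(z_sample) / sum(x_sample)   # z_i / x_i

X_total = 950.0            # hypothetical complete Date I inventory X_i
Z_estimate = b * X_total   # Eq. 46
print(b, ratio_of_totals, Z_estimate)
```

Because every weighted product w*x reduces to unity, the fit needs
nothing more than the two sample totals.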


Now for the estimated inventory on Date II we recall that

$$\sum_{i=1}^{5} Z_i = \sum_{i=1}^{5} b_i X_i \tag{47}$$

whereupon, by the result obtained in Exercise 3 on page 43, it
follows that

(58)


The first term in the brackets arises from the sampling error in
the slope $b_i$, which will vary from one set of sample stores to





another. The second term arises from the fluctuations to be 
expected in the complete inventory Xi at Date I. 
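The two terms can be traced by propagating the errors in $b_i$ and
$X_i$ through the product $Z_i = b_i X_i$. The following is a sketch
only. It assumes that the errors in $b_i$ and $X_i$ are independent,
that the classes are independent of one another, and that
$\sigma_i^2$ denotes the unit-weight variance for class i. It uses
Eqs. 46, 53, and 57, and its arrangement may differ from that of the
displayed Eq. 58.

```latex
\begin{align*}
\sigma_{Z_i}^2 &\approx X_i^2\,\sigma_{b_i}^2 + b_i^2\,\sigma_{X_i}^2
  && \text{(small errors in a product)}\\
\left(\frac{\sigma_{Z_i}}{Z_i}\right)^2
  &\approx \frac{\sigma_{b_i}^2}{b_i^2} + \frac{\sigma_{X_i}^2}{X_i^2}
   = \sigma_i^2\left[\frac{1 + b_i^2}{b_i^2\,x_i} + \frac{1}{X_i}\right]
  && \text{(Eqs. 57 and 53)}\\
\sigma_Z^2 &= \sum_{i=1}^{5} \sigma_{Z_i}^2
   = \sum_{i=1}^{5} Z_i^2\,\sigma_i^2
     \left[\frac{1 + b_i^2}{b_i^2\,x_i} + \frac{1}{X_i}\right]
\end{align*}
```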

It remains to estimate $\sigma^2$. It is best to do this for each class
separately. Since both $x_{ij}$ and $z_{ij}$ are assumed to be subject to
the influence of chance fluctuations, we measure the residuals from
each point perpendicularly to the line $z = b_i x$, and write

$$\sigma^2(\text{ext}) = \sum_{j=1}^{n_i} \frac{w_{ij}\,\text{res}_{ij}^2}{n_i - 1}
  \qquad \text{(Cf. Eq. 21, p. 28)} \tag{59}$$

$$\phantom{\sigma^2(\text{ext})} = \sum_{j=1}^{n_i} \frac{\text{res}_{ij}^2}{x_{ij}\,(n_i - 1)} \tag{60}$$

The factor $n_i - 1$ arises from the reduction of $n_i$ by unity for the
single parameter $b_i$.


Perhaps the simplest way to evaluate the sum of the weighted
squares of the residuals (the summation over j called for in the
formula just written) is to measure each residual graphically,
square it, and divide by the value of $x_{ij}$ as read at the foot of
the perpendicular dropped from the observed point to the line.

Another but theoretically less exact method of evaluating 
this summation would be simply to calculate the residuals from 
the line by the formula 

Residual = $z_{ij} - b_i x_{ij}$ (61)

as if they were measured in the vertical. For a slope near unity,
the sum of the weighted squares calculated with vertical deviations
will be about twice the sum of the weighted squares calculated with
perpendicular distances, and the factor 1/2 can be applied to
compensate.
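The two ways of evaluating the sum can be compared numerically. The
sketch below, with made-up inventories, computes the vertical
residuals of Eq. 61, converts them to perpendicular distances by the
geometric identity that the perpendicular distance from a point to the
line z = bx is the vertical deviation divided by the square root of
1 + b^2, and then forms the external estimate as in Eq. 60. The
function name and the data are assumptions for illustration. The
weights use the observed x values rather than values read at the foot
of each perpendicular.

```python
import math

def sigma2_ext(x, z):
    """External estimate of sigma^2 for one class, following Eq. 60."""
    n = len(x)
    b = sum(z) / sum(x)                                   # slope, Eq. 56
    vertical = [zj - b * xj for xj, zj in zip(x, z)]      # Eq. 61
    perpendicular = [v / math.sqrt(1.0 + b * b) for v in vertical]
    # Weighted sums of squares, each weight being 1/x.
    s_vert = sum(v * v / xj for v, xj in zip(vertical, x))
    s_perp = sum(p * p / xj for p, xj in zip(perpendicular, x))
    return b, s_vert, s_perp, s_perp / (n - 1)

# Hypothetical inventories (cases) for the sample stores of one class.
x_sample = [12.0, 30.0, 45.0, 8.0, 25.0]
z_sample = [10.0, 33.0, 40.0, 9.0, 26.0]

b, s_vert, s_perp, s2 = sigma2_ext(x_sample, z_sample)
print(b, s_vert, s_perp, s2)
```

The vertical sum always equals 1 + b^2 times the perpendicular sum,
so either route gives the same estimate once that factor is allowed
for.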


It will be sufficient for the purpose to set $b_i$ in the brackets of
Eq. 58 equal to unity, an assumption already made in the weights,
and justified by the slopes in Fig. 23. With this simplification it
is found that

(62)

The five terms called for in the summation over i on the right-hand
side are the five separate values of $(\sigma_{Z_i}/Z_i)^2$ for the five
inventories $Z_1, Z_2, Z_3, Z_4, Z_5$.
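A sketch of how the class-by-class terms might be combined is given
here. It assumes, as a reconstruction from Eqs. 53 and 57 with the
slope set to unity and not as a transcription of Eq. 62, that the
relative variance of each class estimate is sigma^2 (2/x_i + 1/X_i),
and that the classes are independent. All figures and names are
hypothetical.

```python
import math

def relative_variance(sigma2, x_sample_total, X_total):
    """(sigma_Zi / Z_i)^2 with b_i = 1: slope term plus Date I term."""
    slope_term = 2.0 * sigma2 / x_sample_total   # from Eq. 57 with b_i = 1
    date1_term = sigma2 / X_total                # from Eq. 53
    return slope_term + date1_term

# Hypothetical per-class figures: (sigma^2(ext), x_i, X_i, estimated Z_i)
classes = [
    (0.8, 120.0, 950.0, 930.0),
    (1.1, 400.0, 2600.0, 2700.0),
    (0.9, 900.0, 5200.0, 5100.0),
    (1.3, 3000.0, 3000.0, 3150.0),   # a class sampled completely
]

var_Z = sum(Z * Z * relative_variance(s2, x, X) for s2, x, X, Z in classes)
Z_total = sum(Z for _, _, _, Z in classes)
print(Z_total, math.sqrt(var_Z))
```

Note that the slope term does not vanish for the class sampled
completely, since it reflects accidental fluctuations rather than the
choice of sample stores.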





Fig. 23 shows a plot of the points for canned peas for the stores 
in the 2d, 3d, 4th, and 5th classes, in the state of New York. 
For this particular commodity, there was no store in the 1st class. 
The units of measurement are designated by the scales. The
slopes of the four lines are $b_2$, $b_3$, $b_4$, and $b_5$. The scatter of the
points is more than one might hope for, but the method gives
useful results nevertheless.

Fig. 24 shows the points for the 5th class separately, with a 
wedge laid off each side of the line to show the width of two esti¬ 
mated standard errors. This wedge is indistinguishable from the 
95 percent confidence band. 

89. Summary of the errors to be considered; effect on sample 
designs. 9 There are two kinds of problems that arise in sampling 
inventories, and we might designate them as Problem A and Prob¬ 
lem B. Problem A consists simply of sampling a pile of schedules. 
Every store in an area (e.g., New York state) has presumably 
sent in an itemized schedule showing the number of cans of peas 
and other commodities on hand at a certain date. In the discus¬ 
sion that now confronts us, this date was Date II, but this is 
unimportant so far as the description of Problem A is concerned. 
The question is how to find, by sampling, a number (an estimate) 
that for purposes of action can be used in place of the total inven¬ 
tory of peas contained in the entire pile of schedules. This is the 
problem, regardless of whether the responses written on the 
schedules are correct or not; and the error in the sample estimate 
will be the difference between that estimate and the actual count 
contained on the schedules, whether it be right or wrong. 

The number of stores is large, perhaps in the thousands. The 
reason for taking the sample would be to hasten the processing 10 
of the data, and to get it done for less money. The pile of inven¬ 
tories might be so big, and the deadline so short, that there is time 

9 The author is exceedingly indebted to Messrs. Hansen and Hurwitz, not 
only for permission to use their example, but more especially for assistance 
rendered in numerous discussions, during which the recognition and evaluation 
of the five different sources of errors were evolved. 

10 The term "processing of data" refers to office operations in the nature
of editing, coding, transcribing, punching and tabulating, posting and
consolidating, in the production of final tables or summaries.





to work with a sample, but not with the complete count. So far 
as Problem A is concerned here, there are two mutually exclusive 
sources of error, which may be outlined below. 


i. The stores that are drawn into the sample are designated 
as the sample stores. In one particular sample survey these are 
a particular set of stores. But if the sample were redrawn from 
the same universe of stores, there would be a different set of 
sample stores, and another estimate Z of the total count. It 
follows that there is a sampling error in the estimated total 
count, and a sampling error in the calculated variances, arising 
from the selection of sample stores. 

Messrs. Hansen and Hurwitz have made an approximate 
evaluation of this source of error, and their result is 




[Ni- 1 


. Ni / 

-E(* 

i]™l \ 



(63) 


If the factor $(N_i - n_i)/(N_i - 1)$ is replaced by unity, this
error is seen to be about half the fourth source of error mentioned
below. This factor, incidentally, reduces the first source
of error to zero when all the schedules of a class are processed,
for then $N_i - n_i = 0$.

ii. The first source of error can be decreased by using an ap¬ 
proximate relationship between x and z, provided one exists. 
An assumption is useful if it makes useful predictions. If some 
other set of assumptions turns out to be better for purposes of 
prediction, and if the extra cost involved in office procedure is 
not too great, a change might be warranted. A change in the 
assumption of a relationship will produce different results, not 
only in the estimated total inventories on Date II, but also in 
the estimated variance of that total inventory. 

Sources i and ii do not both exist simultaneously. Messrs. 
Hansen and Hurwitz, in their evaluation of the sampling error 
(mentioned above), did not make use of any assumed relation¬ 
ship; hence their formula applies to source i only. There is no 
way of evaluating the second source of error analytically, even 
when it exists. 

The effect of either or both of these two sources of error can be 
reduced to any desired degree by taking a big enough sample. 
Messrs. Hansen and Hurwitz wished particularly to reduce 
the error in the fifth class (the largest inventories); hence 
they took all of it. A 10 percent error in the fifth class would 
amount to as many cans of peas as a 50 percent error in one of the 
lower classes. 
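The role of the factor (N_i - n_i)/(N_i - 1) can be illustrated with
the familiar textbook variance of a total estimated from a simple
random sample drawn without replacement. This is offered as a generic
illustration and is not claimed to be the exact form of Eq. 63. The
function names and the numbers are assumptions.

```python
def finite_population_factor(N, n):
    """The factor (N - n) / (N - 1) for sampling without replacement."""
    return (N - n) / (N - 1)

def variance_of_estimated_total(N, n, population_variance):
    """N^2 * factor * S^2 / n, where S^2 is the variance with divisor N."""
    return N * N * finite_population_factor(N, n) * population_variance / n

# Hypothetical class of 800 stores with variance 36 between inventories.
for n in (50, 200, 800):
    print(n, variance_of_estimated_total(800, n, 36.0))
```

When n equals N the variance is zero, which is the situation of a
class processed in full, as was done for the fifth class.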





Problem B includes some other aspects that need to be con¬ 
sidered when one takes into account the influences that affect the 
figures recorded on the schedules. In this problem, the inven¬ 
tories for Date II would ordinarily be collected on a sample basis, 
and the reports that have been received up to a certain deadline 
date would be processed. The action that is to be taken (policies 
in distribution) will affect all the stores in the area, those in the 
sample, and those not. A number of sources of error must be 
considered. 

iii. Late reports introduce an error. It would of course be 
dangerous, if not folly, to assume that the late reports are a ran¬ 
dom sample of the universe. No attempt is made here to eval¬ 
uate the bias arising from late reports. 

In a sampling project, the total number of reporting stores 
may be so small that individual attention can be given to them, 
to reduce the proportion of late reports and the uncertainty 
introduced by them. For instance, one might send out tele¬ 
grams just before the deadline to bring in some of the reports 
that threaten to be delinquent. Moreover, one might subse¬ 
quently follow up some of the late reports, to decide, on the basis 
of empirical evidence, which way and how much the late reports 
affect the estimates. 

iv. There are random errors in the responses of the sample 
stores, and there are fluctuations in their inventories owing to 
extraneous natural influences (such as the weather and freight 
tie-ups in and out), all of which throw the points away from 
whatever relationship may otherwise exist between the inven¬ 
tories on the two dates. 

This source of error is the first term in the brackets of Eq. 58,
and it is seen to be smaller as $x_i$ increases, which is to say that the
fourth source of error grows smaller as the sample grows bigger.

This source of error, unlike the first and second, can not be 
reduced to zero by taking all the stores in any class. 

The 2-sigma band in Fig. 24 is calculated from Eq. 57 and 
corresponds to the first term in the brackets of Eq. 62. This 
band shows how large a sample must be taken to reduce the 
fourth source of error to some desired degree. An assumed 
relationship other than the one adopted would lead to another 
band. 

Deliberate errors of reporting are usually not random, and
constitute an insidious problem of another kind. It is conceivable
that under-reporting cancels out, as when $X_i$ and $x_i$ are both





reported as just 50 percent of their true values; $b_i$ would be
twice as big, but $Z_i$ would be unaffected.

v. Random errors of response, and extraneous natural influ¬ 
ences are present, not only in the sample stores, but in the 
reported inventories of all the stores. As a consequence, the 
total inventory of any class is affected. The effect of this 
error on the total inventory is evaluated by the second term 
in the brackets of Eq. 58. 

vi. There is an error in Problem B arising from the assump¬ 
tion of a particular relationship and weighting. This corre¬ 
sponds to the second source of error, mentioned under Problem 
A. Again, there is no way of evaluating this source analytically. 



APPENDIX 


Tables for Making Random Observations for Class 
Illustration 

Each number represents one observation. The numbers may be 
taken out in any order — across, cornerwise, or in any systematic 
fashion that does not make use of the size of the number. 

Part A: Normal Deviates Directly in Units of the 
Standard Error 

(This table comes from a paper by Edward L. Dodd, Boletin
Matematico, Buenos Aires, Ano xv, 1942: pp. 76-7, with the kind
permission of the author and editor. These numbers were obtained
from a transformation of the first two pages of Tippett's
Random Sampling Numbers.)
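For readers without access to printed random numbers, deviates of
this kind can be produced by passing uniformly distributed four-digit
numbers through the inverse of the normal integral. The sketch below
is one such transformation, assumed for illustration; it is not
claimed to be the transformation that Dodd used. The function name
and the seed are arbitrary.

```python
import random
from statistics import NormalDist

def deviate_from_four_digits(t):
    """Map a four-digit number 0000-9999 to a normal deviate (2 decimals)."""
    p = (t + 0.5) / 10000.0          # mid-point of the number's probability cell
    return round(NormalDist().inv_cdf(p), 2)

rng = random.Random(1938)            # arbitrary seed for repeatability
sample = [deviate_from_four_digits(rng.randrange(10000)) for _ in range(10)]
print(sample)
```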


- 0.64 

0.42 

- 0.26 

2.04 

0.83 

0.23 

- 0.48 

0.16 

- 0.21 

1.67 

- 1.02 

- 1.08 

0.58 

0.09 

- 1.13 

- 0.61 

- 0.60 

0.67 

- 0.41 

- 0.59 

- 0.37 

- 1.23 

0.50 

0.74 

- 1.59 

0.06 

- 1.22 

0.28 

0.26 

0.89 

- 0.19 

1.16 

- 0.60 

1.37 

- 1.08 

1.30 

0.53 

0.28 

1.18 

0.37 

0.22 

- 0.57 

0.00 

- 0.97 

- 0.55 

0.30 

0.27 

- 0.59 

1.46 

- 0.69 

- 0.41 

0.11 

1.14 

0.26 

0.01 

- 0.62 

- 0.84 

0.79 

- 0.96 

0.68 

- 1.72 

1.01 

1.15 

0.56 

1.72 

- 0.57 

1.58 

- 0.34 

- 0.64 

1.19 

- 0.86 

0.39 

0.93 

- 1.00 

- 0.87 

0.01 

- 0.41 

0.55 

- 0.26 

0.57 

0.17 

- 0.38 

1.45 

0.33 

0.36 

0.61 

0.81 

0.79 

- 1.27 

0.49 

- 1.17 

0.40 

- 0.77 

0.00 

- 1.45 

- 0.70 

0.48 

0.03 

0.17 

- 0.31 

0.63 

- 0.40 

0.97 

0.37 

- 0.83 

- 0.77 

- 0.03 

0.63 

- 0.56 

- 1.02 

0.24 

0.12 

0.14 

- 0.06 

- 0.64 

0.07 

- 0.89 

0.38 

0.19 

1.72 

- 0.12 

- 0.91 

- 0.44 

- 0.26 

1.63 

- 0.19 

- 0.57 

2.62 

- 0.79 

0.38 

0.17 

1.48 

0.73 

- 0.97 

0.11 

0.73 

0.50 

- 0.22 

0.63 

0.48 

- 1.19 

0.59 

- 1.14 

0.02 

0.93 

- 0.22 

- 0.22 

1.65 

0.17 

- 0.27 

0.24 

- 0.44 

0.81

- 0.33 

0.24 

1.98 

- 0.60 

- 0.20 

0.00 

1.11 

1.54 

0.56 

- 0.46 

- 1.45 

- 0.54 

- 1.27 

0.35 

- 0.13 

- 0.40 

0.33 

0.26 

1.54 

- 0.43 

- 1.24 

- 1.05 

- 0.05 

1.34 

- 0.98 

0.02 

- 1.50 

0.39 

- 1.33 

- 0.10 

- 0.18 

1.42 

1.47 

- 0.58 

0.55 

- 0.24 

- 0.82 

0.33 

- 0.65 

0.77 

- 0.32 

- 0.55 

0.72 

- 2.36 

0.53 

- 1.12 

- 0.86 







1.05 

1.87 

0.63 

- 2.92 

1.72 

- 0.90 

- 2.27 

- 0.88 

1.57 

1.41 

1.17 

1.53 

- 0.08 

- 1.75 

0.44 

0.14 

- 1.19 

- 0.37 

0.25 

- 0.51 

1.29 

0.15 

- 0.72 

0.89 

- 1.47 

- 0.25 

- 0.24 

- 1.02 

- 0.90 

- 1.09

0.06 

- 0.14 

0.35 

- 0.25 

- 0.31 

- 0.78 

0.91 

- 0.12 

0.34 

- 0.39 

- 1.07 

0.57 

- 0.34 

- 0.98 

- 1.52 

- 0.40 

- 0.14 

- 0.50 

0.54 

1.27 

0.53 

0.17 

0.70 

0.44 

- 0.92 

- 0.48 

- 0.40 

- 0.59 

- 1.38 

1.31 

1.21 

0.04 

1.01 

1.35 

0.81 

- 0.02 

0.19 

0.30 

0.40 

1.08 

- 0.41 

1.33 

1.16 

- 0.38 

1.27 

- 1.42 

0.35 

- 2.75 

- 0.64 

0.25 

- 0.67 

0.61 

1.15 

- 3.89 

- 0.90 

- 0.70 

- 0.36 

0.05 

- 2.01 

- 0.70 

2.08 

0.42 

1.93 

- 0.97 

1.38 

- 1.08 

- 0.52 

1.04 

0.60 

1.56 

- 0.57 

- 1.46 

- 1.24 

- 0.49 

0.67 

- 2.01 

0.81 

- 0.72 

- 0.61 

- 0.02 

- 0.68 

- 0.29 

0.05 

- 0.91 

- 0.77 

1.54 

- 1.61 

- 0.87 

- 0.92 

- 0.81 

- 1.18 

- 0.66 

- 0.68 

1.76 

- 2.47 

- 0.32 

2.22 

0.02 

- 0.28 

- 0.08 

0.78 

0.39 

1.35 

- 0.48 

- 0.22 

- 0.35 

- 1.23 

- 0.76 

- 0.96 

- 0.74 

- 0.33 

0.91 

- 1.11 

- 1.06

- 0.91 

0.75 

0.15 

0.57 

1.60 

- 1.99 

0.77 

0.19 

0.31 

1.75 

1.78 

- 0.80 

0.75 

- 0.81 

0.01 

- 1.59 

- 0.06 

- 0.12 

- 0.60 

0.85 

- 0.09 

- 1.18 

0.61 

0.97 

- 1.66 

0.52 

- 0.49 

0.01 

- 0.03 

- 1.01 

2.10 

0.47 

- 0.01 

- 0.52 

- 1.06 

- 0.58 

- 1.05 

- 1.30 

0.61 

- 0.31 

- 0.02 

2.02 

0.66 

- 0.25 

- 0.56 

- 1.59 

- 0.26 

0.88 

- 0.72 

- 1.00 

0.57 

- 1.95 

- 0.75 

- 0.23 

- 0.47 

2.07 

- 1.15 

- 0.57 

0.63 

- 0.25 

- 0.34 

1.03 

- 0.64 

0.12 

0.11 

- 0.02 

- 1.33 

0.56 

- 0.05 

- 1.12 

1.13 

- 0.77 

- 0.45 

- 0.73 

- 0.22 

2.25 

- 0.06 

- 0.12 

- 0.94 

- 0.65 

0.78 

- 1.88 

1.26 

- 0.79 

0.77 

- 1.44 

- 1.16 

0.58 

0.49 

- 0.20 

- 0.36 

0.02 

- 0.13 

- 0.50 

- 1.45 

- 0.01 

1.39 

- 0.30 

1.37 

- 0.80 

0.45 

1.81 

0.88 

- 0.49 

0.08 

- 1.29 

- 1.53 

0.38 

- 0.04 

1.13 

1.19 

1.07 

- 2.24 

- 0.06 

0.85 

1.42 

- 0.80 

- 0.76 

0.01 

1.10 

- 0.72 

0.76 

1.60 

0.58 

1.00 

- 0.26 

- 1.34 

0.07 

- 1.80 

0.07 

- 0.25 

- 1.94 

1.44 

0.20 

0.18 

- 2.72 

0.79 

1.89 

0.07 

- 0.19 

2.25 

- 0.02 

- 1.29 

1.35 

- 0.67 

- 0.19 

- 1.02 

- 2.01 

- 2.28 

0.39 

1.11 

- 0.07 

0.42 

- 0.87 

- 0.49 

1.37 

0.29 

0.23 

- 1.46 

- 0.28 

0.46 

- 1.06 

1.35 

- 0.06 

2.34 

- 0.31 

0.64 

- 0.60 

- 0.12 

0.84 

1.10 

- 0.42 

0.82 

- 0.04 

- 0.17 

1.79 

0.18 

2.27 

- 0.69 

- 1.34 

- 0.22 

0.43 

- 1.41 

0.29 

- 0.21 

- 1.00 

- 0.56 

- 1.46 

- 1.13 

1.28 

- 1.02 

- 1.05

- 0.12 

- 0.68 

- 0.36 

0.08 

0.72 

0.56 

- 1.27 

- 0.70 

- 0.09 

- 1.54 

2.16 

- 0.14 

0.17 

0.73 

- 1.18 

- 0.73 

0.44 

- 0.37 

2.12 

1.40 

- 1.64 

0.38 

0.09 

1.25 

0.39 

- 1.84 

0.81 

0.92 

- 1.37 

0.87 

0.18 

- 0.86 

- 0.97 

- 0.18 

- 0.89 

0.08 

- 0.38 

1.27 

0.66 

- 0.30 

- 1.12 

- 1.17 

- 0.43 

1.22 

- 0.27 

0.46 

- 0.95 

- 1.85 

0.01 

- 0.25 

- 0.81 

2.08 

1.10 

- 0.44 

0.58 

0.29 

0.79 

- 0.83 

0.04 

- 0.84 

- 0.54 

- 1.98 

0.73 

- 0.57 

- 0.85 

1.28 

- 1.27 

0.60 

0.74 

- 0.21 

0.89 

0.60 

2.54 

- 1.38 

0.06 

- 0.27 

0.84 

- 1.37 

0.20 

- 0.13 

- 1.84 

2.38 

0.00 

0.77 

0.51 

- 1.36 

- 0.85 

2.26 

2.06 

0.81 

- 0.22 

0.48 

- 1.99 

- 1.11 

0.65 

1.20 

- 0.99 

0.79 

1.65 

- 0.23 

- 1.26 

- 0.53 

0.27 

- 1.30 





- 0.81 

- 1.19 

- 1.63 

- 2.00 

- 1.62 

- 0.24 

0.64 

0.64 

- 0.44 

- 0.32 

0.49 

- 0.66 

1.27 

- 0.49 

1.41 

- 0.72 

1.96 

0.84 

- 0.78 

- 0.80 

0.60 

- 0.60 

- 0.42 

- 0.37 

2.00 

0.14 

1.62 

0.16 

0.19 

0.33 

0.48 

0.89 

0.42 

0.46 

- 0.63 

- 0.19 

0.48 

- 0.03 

- 0.12 

0.76 


- 0.31 

0.32 

0.24 

0.47 

1.37 

0.35 

0.40 

- 1.01 

0.80 

0.52 

0.67 

1.75 

- 0.35 

- 1.78 

- 1.76 

- 0.47 

0.55 

2.04 

- 0.56 

0.18 

- 0.81 

- 0.43 

- 1.78 

0.24 

- 0.07 

0.13 

0.12 

- 2.07 

- 0.37 

0.31 

- 0.14 

- 0.54 

- 0.02 

- 2.48 

1.15 

1.01 

1.67 

- 0.38 

- 0.28 

0.60 


- 1.53 

- 1.48 

- 1.08 

0.71 

- 0.81 

0.95 

- 0.21 

- 1.72 

0.80 

0.60 

1.33 

- 0.61 

- 1.25 

0.34 

- 0.99 

0.99 

- 1.02 

0.89 

- 1.01 

0.76 

- 0.14 

1.67 

- 0.27 

0.60 

- 0.32 

- 0.48 

0.61 

0.52 

- 0.52 

0.30 

2.55 

1.86 

- 1.76 

1.23 

- 0.17 

1.28 

0.40 

0.74 

- 1.11 

0.56 


- 0.75 

0.07 

- 0.02 

- 0.08 

- 0.36 

0.39 

- 0.39 

- 0.32 

- 0.10 

1.74 

- 0.01 

- 1.00 

1.27 

- 0.20 

- 0.15 

- 2.08 

- 0.13 

- 0.88 

- 0.02 

2.41 

- 1.84 

1.80 

1.07 

0.22 

- 0.34 

- 0.15 

- 0.27 

0.60 

1.02 

- 0.83 

- 0.36 

0.96 

- 0.06 

0.64 

- 1.15 

- 1.60 

2.52 

1.02 

- 0.03 

0.42 




Part B: Normal Distribution of the Numbers from 0000
to 9999. Class Interval 0.2σ.

(This table is to be used in conjunction with Tippett's numbers, 
in circumstances where a longer series than that in Part A is 
required, or where it is desired to use pages of Tippett's numbers 
other than the first two.) 


Centers and limits are in units of σ. The cumulative area is taken up
to the upper limit of each interval. The first interval extends from
-∞ to -3.7σ, and the last from 3.7σ to ∞.

Center of   Upper limit   Cumulative   Area of     Tippett's numbers   Center of
interval    of interval   area         interval    (0000-9999)         interval

-3.8        -3.7          0.0001078    0.0001078   0000                -3.8
-3.6        -3.5          0.0002326    0.0001248   0001                -3.6
-3.4        -3.3          0.0004834    0.0002508   0002-0004           -3.4
-3.2        -3.1          0.0009676    0.0004842   0005-0009           -3.2
-3.0        -2.9          0.0018658    0.0008982   0010-0018           -3.0
-2.8        -2.7          0.0034670    0.0016012   0019-0034           -2.8
-2.6        -2.5          0.0062097    0.0027427   0035-0061           -2.6
-2.4        -2.3          0.0107241    0.0045144   0062-0106           -2.4
-2.2        -2.1          0.0178644    0.0071403   0107-0178           -2.2
-2.0        -1.9          0.0287166    0.0108522   0179-0286           -2.0
-1.8        -1.7          0.0445655    0.0158489   0287-0445           -1.8
-1.6        -1.5          0.0668072    0.0222417   0446-0667           -1.6
-1.4        -1.3          0.0968005    0.0299933   0668-0967           -1.4
-1.2        -1.1          0.1356661    0.0388656   0968-1356           -1.2
-1.0        -0.9          0.1840601    0.0483940   1357-1840           -1.0
-0.8        -0.7          0.2419637    0.0579036   1841-2419           -0.8
-0.6        -0.5          0.3085375    0.0665738   2420-3084           -0.6
-0.4        -0.3          0.3820886    0.0735511   3085-3820           -0.4
-0.2        -0.1          0.4601722    0.0780836   3821-4601           -0.2
 0.0         0.1          0.5398278    0.0796556   4602-5397            0.0
 0.2         0.3          0.6179114    0.0780836   5398-6178            0.2
 0.4         0.5          0.6914625    0.0735511   6179-6914            0.4
 0.6         0.7          0.7580363    0.0665738   6915-7579            0.6
 0.8         0.9          0.8159399    0.0579036   7580-8158            0.8
 1.0         1.1          0.8643339    0.0483940   8159-8642            1.0
 1.2         1.3          0.9031995    0.0388656   8643-9031            1.2
 1.4         1.5          0.9331928    0.0299933   9032-9331            1.4
 1.6         1.7          0.9554345    0.0222417   9332-9553            1.6
 1.8         1.9          0.9712834    0.0158489   9554-9712            1.8
 2.0         2.1          0.9821356    0.0108522   9713-9820            2.0
 2.2         2.3          0.9892759    0.0071403   9821-9892            2.2
 2.4         2.5          0.9937903    0.0045144   9893-9937            2.4
 2.6         2.7          0.9965330    0.0027427   9938-9964            2.6
 2.8         2.9          0.9981342    0.0016012   9965-9980            2.8
 3.0         3.1          0.9990324    0.0008982   9981-9989            3.0
 3.2         3.3          0.9995166    0.0004842   9990-9994            3.2
 3.4         3.5          0.9997674    0.0002508   9995-9997            3.4
 3.6         3.7          0.9998922    0.0001248   9998                 3.6
 3.8         ∞            1            0.0001078   9999                 3.8
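The Tippett intervals in Part B appear to follow from rounding 10,000
times the cumulative area at each upper limit. The sketch below
rebuilds them on that assumption and uses them to convert a
four-digit random number into the center of its class interval. The
names `CENTERS`, `EDGES`, and `center_for_number` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

STEP = 0.2
CENTERS = [round(k * STEP, 1) for k in range(-19, 20)]          # -3.8 ... 3.8
UPPER_LIMITS = [round(c + STEP / 2, 1) for c in CENTERS[:-1]]   # -3.7 ... 3.7
# First four-digit number lying beyond each upper limit.
EDGES = [round(10000 * NormalDist().cdf(u)) for u in UPPER_LIMITS]

def center_for_number(t):
    """Return the interval center (in units of sigma) for a number 0-9999."""
    return CENTERS[bisect_right(EDGES, t)]

for t in (0, 4, 4602, 9998, 9999):
    print(f"{t:04d}", center_for_number(t))
```

A longer series of deviates than Part A provides can be had by
applying `center_for_number` to successive four-digit random numbers.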




INDEX 


(The numbers refer to pages.

Adjusted observations, 2,16; weights 
of, 21, 33, 46, 66, 68, 85, 90 
Adjusted parameters, weights of, 16, 
167 

Adjusting sample frequencies, 96 ff.; 
by iterative proportions, 115; by 
the Bruyère method, 124; by the
Stephan method, 121; when only 
one cell requires adjustment, 
119 

Adjustment, formulas for, 52, 74, 
138; geometry of, 132, 133, 142, 
144, 228; nature of, 13; numerical 
illustration, 81, 227; plane triangle, 
7, 60, 74; procedure, 6, 139, 140; 
segments of a line, 8, 86 
Adjustment for bias, 10, 127 
Adjustment of parameters, formulas 
for, 52, 136; second, 52, 180 
A. C. Aitken, 16, 19, 160, 173 
Alternate hypothesis, 28 
Analysis of variance, 27, 29 
R. L. Anderson, 173 
Approximate values of parameters, 
51, 52, 137, 153; method of aver¬ 
ages, 137; method of selected 
points, 138; method of zero sum, 
137 

Auxiliary constants, 91, 94; number 
of, 95 

Averages, method of, 137 
Bessel, 27, 60 

Bias, adjustment for, 10, 127; detec¬ 
tion of, 11 

M. D. Bingham, 159
Raymond T. Birge, 27, 29, 88, 169,
173, 175 


Proper names are in capitals.)

C. I. Bliss, 210 

Maxime Bôcher, 58

Donald Bruce, 231 

Paul T. Bruyère, method, 123 ff.

Jesse H. Buell, 231, 238 

Calculated curve, 16, 18, 132, 133 
Calculated points, 18, 130, 132, 133; 
possible and impossible positions, 
146 

Calculation of mean and standard 
deviation, 150; rapid method of, 
151 

Norman Campbell, 137 
Cauchy distribution, 39 
Cauchy method for obtaining ap¬ 
proximations of parameters, 138 
Cell, definition of, 98 
Cell frequency, 98; adjustment of, 
99; estimation of, 101 
Centroid, 174, 181; see also Quasi 
center 

Chio’s pivotal expansion (of deter¬ 
minants), 161 

Chi-square, definition of, 15, 22, 27, 
88; distribution of, 141 
Chi-test, 15, 18, 132 
Closure, error of, 84, 109, 228 
Condition equations, 50, 51, 132; 
derivatives of, 51; geometric, 
59 ff., 70, 76; numerical values of 
derivatives, 71; reduced, 53; with¬ 
out parameters, 59 
Confidence bands, 169, 170, 171, 
230; see also Error bands 
Consistency; see External consist¬ 
ency and Internal consistency 
Controls, 100 


Corner check (sum check), 72, 155 
Correlates, or correlatives, 53 
Correlation coefficient, 177 
Covariance (product variance), 19, 
160 

Curve; calculated, 18, 132, 133; 
dosage-mortality, 210; true, 132, 
133 

Curve fitting; an isotherm, 212; 
graphical considerations, 130, 143; 
miscellaneous functions, 208; poly¬ 
nomial, 172, 218; the exponen¬ 
tial and its logarithmic form, 191; 
the exponential with a linear com¬ 
ponent, 203; the hyperbola and 
its logarithmic form, 204; the 
hyperbola with a linear component, 
206; the line, 173; the normal 
equations, 136; the parabola, 187; 
the purpose, 128 
Emanuel Czuber, 142 

Data, object of taking, 1 
Datum, arbitrary, 151 
Harold T. Davis, 173 
Degrees of freedom, 18, 34, 141 
Deviates, normal, 210, 252 
Edward L. Dodd, 252 
M. H. Doolittle, 56, 157 
Dosage-mortality curve, 210 

J. F. Encke, 60 

Equations of condition; see Con¬ 
dition equations 

Error, in transformation to loga¬ 
rithms, 44; of closure, 84, 109, 228 
Error bands, 168, 171, 230; see also 
Confidence bands 

Errors of sampling; see Sampling 
errors 

Estimate of σ, by external consistency,
27, 28, 34, 230; by internal
consistency, 29; unbiased, 27, 168


Exponential; logarithmic form, 191; 

with linear component, 203 
External consistency, 27, 28; see also 
Internal consistency 
External estimate of σ compared
with prior σ, 230
Mordecai Ezekiel, 138 

Face totals, 112 

R. A. Fisher, 18, 22, 28, 30, 34, 
168, 169, 171, 172, 209, 218, 230, 
237 

Forest mensuration (example), 231 
Lester R. Frankel, 32 
Freezing (near indeterminacy of 
equations), 160 

Frequency (cell frequency), 98 

W. L. Gaines, 198, 207 
G. R. Gause, 198
Gauss, 15, 16, 27, 53, 54, 56, 58, 59, 
60, 90, 140, 141, 157, 159, 160, 
167, 180 

Gauss brackets, 54 
Gauss symbols, 157 
General normal equations, 55 
Geometric conditions, 59, 70, 76; 
solution without Lagrange multi¬ 
pliers, 62 

Corrado Gini, 184 
M. A. Girshick, 159 
Goodness of fit, 18 

Graphical considerations of curve 
fitting, 130, 143 

F. dos S. Hall, 231 
Morris H. Hansen, 47, 240, 243, 
248, 249 

Hatchability of eggs, 32 
J. F. Hayford, 65 
Helmert, 142 
Harold Hotelling, 159 
E. E. Houseman, 173 
David Hume, 12 





William N. Hurwitz, 240, 243, 248,
249 

Hyperbola, generalized, 204; loga¬ 
rithmic form, 205; with linear 
component, 206 
Hypothesis, alternate, 28 

Indeterminacy (near indeterminacy), 
160, 161, 163 

Information (Fisher), amount of, 22 
Instability, 160, 217 
Internal consistency, 29; see also 
External consistency 
Inverse matrix; see Reciprocal matrix 
Isotherm, fitting of, 212 
Iterative proportions; method, 115; 
simplification when only one cell 
requires adjustment, 119; three 
dimensions, 117; two dimensions, 
115 

Jacobi, 159 
Jacobian, 166 

Truman L. Kelley, 159 
J. M. Keynes, 12 
Charles H. Kummell, 64, 140, 145,
184 

L coefficients, 55, 59, 134, 135 
Lagrange, 53 

Lagrange multipliers; calculations 
of, 91, 92, 93, 104, 109, 110, 136; 
method of, 53; normal equations 
for, 82, 86, 109, 111; number of, 
55, 95; solution without Lagrange 
multipliers, 62

Least squares; computation for 
fitting curves, 158; formulation of 
general problem, 49 ff.; method 
of, 15; principle of, 2, 14, 16
O. M. Leland, 64 
G. J. Lidstone, 27 
Line segment, 8, 86 
Line of worst fit, 184 


Logarithmic form; generalized hy¬ 
perbola, 204; of the exponential, 
191, 193; special remarks, 195, 
198, 201 

Marginal total (rim total), 98 
Matrix; for formation of normal 
equations, 72, 155; notation, 166; 
preparation of matrix, numerical 
examples, 79, 222, 224; reciprocal 
matrix, 19, 91, 159, 160, 162, 217; 
see also Reciprocal matrix 
Maximum error, 161 
Tobias Mayer, 137 
Mean, 18; rapid method of calcula¬ 
tion, 151; standard error of mean, 
21, 40; weighted, 19, 24, 26 
Mean square error; of a difference, 
42; of a sum, 42; percentage mean 
square error, 43; propagation of, 39 
Method; of averages, 137; Bruyère,
123, 124; Cauchy, 138; least 
squares, 15; selected points, 138; 
Stephan, 121, 123; zero sum, 137 
Michels, 128, 212 

Near indeterminacy, 160, 161, 163 
V. A. Nekrasoff, 169 
Normal deviates in units of standard 
error, 252 

Normal equations; direct solution, 
18; exhibit in symbols, 158; ex¬ 
ponential, 192 ff.; for conditions 
without parameters, 73, 106; for 
curve fitting, 136; for fitting an 
isotherm, 214; for geometric con¬ 
ditions, 59; for the line, 174 ff.; 
formation, 70 ff., 79, 155; freezing, 
160; general, 55; hyperbola, 205,
206; in a numerical example, 82; 
number of equations, 56; parabola, 
188 ff.; tabular solution, 19, 33, 66, 
148,150; unequivocal solution, 156 
K. A. Norton, 198 





Observations, rejection of, 171 
Order, importance of, 3, 4 
Orthogonal functions, 10, 172 

A. de Forest Palmer, 27, 160 
Parabola, 187 

Parameters; adjustment of, 130; 
approximate values, 52, 131, 137, 
153; methods of obtaining, 137, 
138; weight, 167 
Karl Pearson, 27, 184, 186 
Percentile, 211 
P. Pizzetti, 142 
Plane triangle, 7, 60, 64 
Positive definite (normal equations), 
58 

Powers of 10, manipulation of, 224 
Precision, 23 
Prediction, 3, 129 
Principle of least squares, 2, 14, 16 
Prior value of σ compared with σ
(ext), 230

Probable error, 169 
Probits, 210 

Product variance (covariance), 19, 
160 

Propagation of error; in functions 
of one variable, 37; in functions of 
several variables, 38 
Propagation of mean square error, 
40; of variance, 40; of weight, 40 
Ruth R. Puffer, 22 

Quadratic form, 68 
Quasi center, 181 

Randomness; importance of order, 
3, 4; Shewhart criterion, 6, 220; 
tables for making random obser¬ 
vations, 252 

Reciprocal matrix, 19, 21, 156; 
as a multiplier, 91, 218; calculation
of, 92, 162, 201; for the line,
176, 185; for the parabola, 190; in a


numerical example, 92, 217, 224; 
order of, 95; see also Matrix 
Reciprocal solution, 160, 165; in a 
numerical example, 226 
Reduced conditions, 52 
Rejection of observations, 171 
Residual, 14; calculation of, 56, 151, 
158, 216, 230; definition of, 17, 
50; standardized, 22, 23; variance 
of, 127; see also Sum of squares 
Rim total (marginal total), 98 

G. Robinson, 137, 159, 167 

Root mean square error, 42; see also 
Standard error 

S; see Sum of squares 
Frank S. Salisbury, 159 
Sample frequencies; adjusted by 
Bruyère method, 124; adjusted by
least squares, 101 ff.; adjusted by 
Stephan method, 121; adjustment 
to expected marginal totals, 96 ff. 
Sample surveys, 10, 31; of canned 
goods, 240 

Sampling errors, 10, 127, 249 
Sampling ratio, 98 
Sampling variance, 100 
Max Sasuly, 173 
Henry Schultz, 160, 170, 177 
Francis X. Schumacher, 231, 238 
Schwarz-Christoffel inequality, 162 
Second adjustment of parameters, 
52, 180 

Segments of a line, adjustment of, 9, 86 
Selected points, method of, 52, 131, 
138 

Seidel, 159 

John D. Shea, 173, 175 
Walter A. Shewhart, 6, 169, 171, 
220; control chart, 220; criterion 
of randomness, 6, 220 
Significance; scientific, 12; statisti¬ 
cal, 12, 30; tests of, 30, 169 

H. Silverstone (and Aitken), 16 





Slice totals, 110 

Small errors; in functions of one 
variable, 37; in functions of several 
variables, 38; numerical example, 41 
John H. Smith, 201 
T. Smith, 159, 160 
Solution without Lagrange multi¬ 
pliers, 62 

Stability (randomness), 6 
Standard deviation, 18; rapid method 
of computing, 151 

Standard error, 12, 15, 21; of a 
curve, 167; of a function of the 
parameters, 167; of a mean, 40; 
of adjusted parameters, 167; of 
calculated ordinates, 229 
Standardized residuals, 22, 23 
Statistical significance, 12, 30 
Frederick F. Stephan, 100, 121, 
123 

R. Meldrum Stewart, 140 
J. Stevens Stock, 32 
Student, 34, 142; Student’s distri¬ 
bution, 169 

Sum check (corner check), 72, 155 
Sum of squares (S), 14, 20; effect of 
changes in units, 23; formulas for 
S, 58, 163, 164; removed by 
regression, 175; short expression, 
56; special formulas, 58, 163, 164; 
systematic computation, 156 
Surveying problem, 74 ff. 

Systematic solution of normal equa¬ 
tions, 156; form for computation, 
158 

t test, 34, 169 

Tabular solution of normal equa¬ 
tions, 19, 33, 66, 148, 150; exhibit 
in symbols, 158 
Taylor’s series, 38, 41, 52, 139 
Tests of significance, 30, 169 
Tippett’s numbers, 107, 219, 220, 
252, 255 


Triangle problem, 60; without La¬ 
grange multipliers, 62 
True points, 133 
True value, definition of, 49 
L. B. Tuckerman, 161 

Horace S. Uhler, 140 
Unbiased estimate of σ, 27, 168 
Unequivocal solution of normal equa¬ 
tions, 156 

Vacancy; rate, 32; sample survey of, 
31 

Variance; analysis of, 29; of resid¬ 
uals, 127; propagation of, 40; 
sampling, 100 

Variance coefficient, 19, 21, 22, 46; 
of a function of unit weight, 21; 
of parameters, 160 
Variate, random, 21 

Abraham Wald, 137 
Weight; definition of, 21; for loga¬ 
rithmic transformation, 45, 200; 
of adjusted parameters, 167; of 
function of adjusted angles, 64; 
propagation of, 40 

Weights of functions after adjust¬ 
ment, 68; numerical examples, 
85, 94; short method of com¬ 
puting, 90 
L. D. Weld, 86 

E. T. Whittaker (and G. Robin¬ 
son), 137, 159, 167 
Benjamin Williamson, 53 
E. B. Wilson, 22 
H. Wouters, 212 

T. W. Wright (and Hayford), 65 

Frank Yates, 172 
Theodore Yntema, 201, 202 

z (Fisher), 28, 30 
Zero sum, method of, 137 





